I. Description of project

I.A. Hypotheses

Q1: Do reference and restored sites exhibit differences in pollinator biodiversity?

H0:
• No significant difference in pollinator biodiversity between reference and restored sites → pollinators are highly mobile and can readily colonize restored habitats, so restored sites reach a biodiversity and composition similar to those of reference sites.

H1:
• Pollinator diversity is higher in reference sites than in young restored sites → habitat complexity, trophic levels, vegetation structure, and resource availability all take time to develop in restored sites.

Q2: Do traditional and automated methods yield different compositions of pollinators?

H0:
• No significant difference in the pollinator composition detected by traditional and automated methods → all methods aim to capture a representative sample of the pollinator community.

H1:
• Traditional and automated methods yield different compositions of pollinators → inherent biases in sampling efficiency, taxonomic resolution, and target taxa lead to discrepancies in which pollinators are detected.

Q3: Can automated methods effectively detect biodiversity changes between reference and restored sites?

H0:
• Automated methods do not effectively detect biodiversity changes between reference and restored sites → due to limitations in capturing the full spectrum of pollinator abundance and species richness, as well as potential challenges in achieving high taxonomic resolution.

H1:
• Automated methods can effectively detect biodiversity differences between reference and restored sites → machine learning algorithms can offer improved sensitivity and accuracy in detecting shifts in pollinator communities over time.

II. Setup

II.A. libraries
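
The libraries chunk is not shown in this rendering. Judging from the functions called further down (`%>%`, `ggplot()`, `pivot_longer()`, `str_wrap()`, `rcorr()`, `corrplot()`, `fviz_pca_biplot()`, `paletteer_d()`, `plot_grid()`), a loading step along the following lines would be needed. The package list is an assumption inferred from those calls, not the original chunk.

```r
# Packages inferred from the functions used in this document (an assumption)
pkgs <- c("dplyr", "tidyr", "ggplot2", "stringr",  # data wrangling and plotting
          "Hmisc",                                  # rcorr()
          "corrplot",                               # corrplot()
          "factoextra",                             # fviz_pca_biplot()
          "paletteer", "cowplot")                   # palettes, plot_grid()

# Report which packages are missing instead of failing halfway through
installed <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)
if (any(!installed)) {
  message("Missing packages: ", paste(pkgs[!installed], collapse = ", "))
}

# Attach only the packages that are available
invisible(lapply(pkgs[installed], library, character.only = TRUE))
```

Since `Hmisc` and `dplyr` both export a `summarize()` function, the explicit `dplyr::` prefixes used throughout the document remain a sensible precaution.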

II.B. aesthetics

#define custom colors for plotting

#custom_colors <- c("Young Restored" = "#9BB655FF", "Reference" = "#1F78B4")

#saturated_pal <- c(
  #reference site "DES" = "#7F3B19FF",    "HLI" = "#FDB863FF",    "JEP" = "#E08214FF",    "STP" = "#B35806FF",  "WUP" = "#FEE0B6FF",  
  # restored sites  "BUH" = "#93C6E1FF",   "KOT" = "#5F93ACFF",    "WDG" = "#2E627AFF",    "WED" = "#00344AFF"  )

saturated_pal <- c(
  #reference sites
  "DES" = "#93C6E1FF",  
  "HLI" = "#5F93ACFF",  
  "JEP" = "#2E627AFF",  
  "STP" = "#00344AFF",  
  "WUP" = "#1F78B4",
  # restored sites
  "BUH" = "#5C7424",  
  "KOT" = "#C3D69B",  
  "WDG" = "#9BB655",  
  "WED" = "#6D8F3C"  
)

# bicolor palette
bicolor_pal <- c(
  "DES" = "#1F78B4",  
  "HLI" = "#1F78B4",  
  "JEP" = "#1F78B4",  
  "STP" = "#1F78B4",
  "WUP" = "#1F78B4",
  "BUH" = "#9BB655FF",  
  "KOT" = "#9BB655FF",  
  "WDG" = "#9BB655FF",  
  "WED" = "#9BB655FF" 
)

bicolor_bg <- c(
  "DES" = "lightblue",  
  "HLI" = "lightblue",  
  "JEP" = "lightblue",  
  "STP" = "lightblue",
  "WUP" = "lightblue",
  "BUH" = "#D3D5AEFF",  
  "KOT" = "#D3D5AEFF",  
  "WDG" = "#D3D5AEFF",  
  "WED" = "#D3D5AEFF" 
)

theme_new <- theme_classic(base_size = 12) +
  theme(axis.line = element_blank(),
        axis.text = element_text(colour = "black"),
        axis.ticks = element_line(linewidth = 0.4, colour = "black"),
        legend.key.size = unit(0.5, "cm"),
        legend.margin = margin(t = 0),
        legend.text = element_text(size = 8),
        legend.title = element_text(size = 9),
        panel.border = element_rect(linewidth = 0.4, colour = "black", fill = NA),
        panel.grid.major.y = element_line(colour = "grey90", linewidth = 0.2),
        plot.margin = margin(2, 2, 2, 2, "pt"),
        plot.title = element_text(size = 12))
theme_set(theme_new)

Predictor colors

Setting predictor colors lets us draw the same variable in the same color across all plots of the modelling steps. To call one of the colors, index the vector by name: predictor_colors["Floral_simpson_index_T"].

One option: RColorBrewer::Spectral

# Select 9 colors from the Spectral palette (replaced by the manual mapping below)
predictor_colors <- paletteer::paletteer_d("RColorBrewer::Spectral")[1:9]

# Assign colors to the predictor variables
predictor_colors <- c(
  "dm_temperature"= "#F8A02EFF",
  "Floral_simpson_index_T"= "#4F3855FF",
  "Floral_simpson_index_site"= "#3F3955FF",
  "minutes_since_9am" ="black",
  "top2_ratio"= "#254100FF",
  "Plot_Cover_T" = "#9D3A5EFF",
  "Days_since_start" = "#808080FF",
  "dm_wind_velocity" ="#4987A0FF", 
  "rec_time_min" = "#263D5DFF",
  "average_flower_cover" ="#9D3A5EFF")

# View the named color mapping
print(predictor_colors)
##            dm_temperature    Floral_simpson_index_T Floral_simpson_index_site 
##               "#F8A02EFF"               "#4F3855FF"               "#3F3955FF" 
##         minutes_since_9am                top2_ratio              Plot_Cover_T 
##                   "black"               "#254100FF"               "#9D3A5EFF" 
##          Days_since_start          dm_wind_velocity              rec_time_min 
##               "#808080FF"               "#4987A0FF"               "#263D5DFF" 
##      average_flower_cover 
##               "#9D3A5EFF"
barplot(
  rep(1, length(predictor_colors)),
  col = predictor_colors,
  names.arg = names(predictor_colors),
  las = 2,  # rotate labels
  cex.names = 0.8,
  main = "Predictor Color Preview"
)
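
Looking up a color by predictor name is ordinary named-vector indexing. The sketch below uses a two-entry copy of the mapping above under the hypothetical name `predictor_colors_demo`; `unname()` is added here because a bare hex code is often easier to pass to a fixed `colour =` argument.

```r
# Two entries copied from the mapping above
predictor_colors_demo <- c(
  "dm_temperature"         = "#F8A02EFF",
  "Floral_simpson_index_T" = "#4F3855FF"
)

# Single lookup by name (returns a named character vector of length 1)
one_col <- predictor_colors_demo["Floral_simpson_index_T"]

# Drop the name when only the hex code is needed
one_hex <- unname(one_col)

# Several predictors at once, in the order requested
two_hex <- unname(predictor_colors_demo[c("Floral_simpson_index_T", "dm_temperature")])
```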

One option: MetBrewer::Derain

# predictor_colors <- paletteer::paletteer_d("MetBrewer::Derain")[1:7]
# Assign colors to the predictor variables
# predictor_colors <- setNames(predictor_colors, c(
#   "dm_temperature", "top2_ratio", "Plot_Cover_T", "Floral_simpson_index_T",
#   "minutes_since_9am", "Days_since_start", "dm_wind_velocity"))

Another option: MetBrewer::Nattier

# predictor_colors <- paletteer::paletteer_d("MetBrewer::Nattier")[1:8]
# Assign colors to the predictor variables
# predictor_colors <- setNames(predictor_colors, c(
#   "dm_temperature", "Floral_simpson_index_T", "minutes_since_9am", "top2_ratio",
#   "Plot_Cover_T", "Days_since_start", "dm_wind_velocity", "rec_time_min"))

II.C. data loading

### Environmental data ------------

## landscape data (land cover classes), read from corine_data.csv
landscape_data <- read.csv("C:/Users/Almas/Desktop/UNI_LEIPSI/Thesis/Thesis_Rproject/data/corine_data.csv")

## weather data from the Deutscher Wetterdienst (DWD)
dwd_weather <- read.csv("C:/Users/Almas/Desktop/UNI_LEIPSI/Thesis/Thesis_Rproject/data/dwd_weather_data.csv")

## weather data from the field
field_weather <- read.csv("C:/Users/Almas/Desktop/UNI_LEIPSI/Thesis/Thesis_Rproject/data/field_weather_data.csv")

## plant data
plants <- read.csv("C:/Users/Almas/Desktop/UNI_LEIPSI/Thesis/Thesis_Rproject/data/plants3.csv")
relative_flower <- read.csv("C:/Users/Almas/Desktop/UNI_LEIPSI/Thesis/Thesis_Rproject/data/relative_flower.csv")
top2 <- read.csv("C:/Users/Almas/Desktop/UNI_LEIPSI/Thesis/Thesis_Rproject/data/top2.csv")

## extrapolated plant data
#inext_plants <- read.csv("C:/Users/Almas/Desktop/UNI_LEIPSI/Thesis/Thesis_Rproject/data/iNext_AsyEst_plants.csv")

### Traditional methods ------------

## netting data 
netting <- read.csv("C:/Users/Almas/Desktop/UNI_LEIPSI/Thesis/Thesis_Rproject/data/net_data_long.csv")

## pan trap data for families
pan_family <- read.csv("C:/Users/Almas/Desktop/UNI_LEIPSI/Thesis/Thesis_Rproject/data/bowltrap_clean.csv")

### Cameras ------------

## flower camera data
flower_camera <- read.csv("C:/Users/Almas/Desktop/UNI_LEIPSI/Thesis/Thesis_Rproject/data/bioclip_flower_cams.csv")

## platform camera data 
platform_camera <- read.csv("C:/Users/Almas/Desktop/UNI_LEIPSI/Thesis/Thesis_Rproject/data/InsectDetect_platform_cams.csv")
platform_logs_rec <- read.csv("C:/Users/Almas/Desktop/UNI_LEIPSI/Thesis/Thesis_Rproject/data/platform_recording_logs.csv")
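
All files are read from one folder through absolute paths, so the script only runs on the original machine. One way to make this portable is to keep the folder in a single variable and build each path with `file.path()`. The names `data_dir` and `read_data()` are hypothetical, and the relative folder `data` is an assumption about the project layout.

```r
# Hypothetical: folder holding the CSV files, relative to the project root
data_dir <- "data"

# Hypothetical helper: build the full path and read one CSV from `dir`
read_data <- function(file, dir = data_dir) {
  path <- file.path(dir, file)
  if (!file.exists(path)) stop("File not found: ", path)
  read.csv(path)
}

# Usage, with file names taken from the calls above:
# netting     <- read_data("net_data_long.csv")
# dwd_weather <- read_data("dwd_weather_data.csv")
```

Failing early with the full path in the error message makes a misplaced file easier to spot than the default `read.csv()` error.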

III. EDA & data cleaning

III.A. Environmental variables

III.A.1. Dwd weather data

#change dwd_weather date format from 20240724 to 2024-07-24 by adding a "-" after the year and month
dwd_weather <- dwd_weather %>%
  #adding a "-" after the year and month
  mutate(Date = gsub("(\\d{4})(\\d{2})", "\\1-\\2", MESS_DATUM))%>%
  #adding - after the month
  mutate(Date = gsub("(\\d{4}-\\d{2})(\\d{2})", "\\1-\\2", Date))
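
The two `gsub()` calls above insert the dashes step by step. An equivalent route, sketched here on made-up values, is to let `as.Date()` parse the compact `YYYYMMDD` form directly, which also yields a real `Date` rather than a character string. The vector `mess_datum` stands in for the `MESS_DATUM` column and is assumed to hold eight-digit numbers or strings.

```r
# Stand-in for dwd_weather$MESS_DATUM (illustrative values)
mess_datum <- c(20240724, 20240801)

# Parse YYYYMMDD directly; as.character() covers the case of a numeric column
date_parsed <- as.Date(as.character(mess_datum), format = "%Y%m%d")

# Same character representation that the two gsub() steps produce
date_chr <- format(date_parsed, "%Y-%m-%d")

# The original two-step regex, for comparison
date_gsub <- gsub("(\\d{4})(\\d{2})", "\\1-\\2", as.character(mess_datum))
date_gsub <- gsub("(\\d{4}-\\d{2})(\\d{2})", "\\1-\\2", date_gsub)
```

A side benefit of `as.Date()` is that malformed entries become `NA` instead of passing through as oddly formatted strings.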

#select the relevant columns for our analysis (Date, SITE, FM, TMK) and rename them; dm = daily mean
dwd_weather <- dwd_weather %>%
  dplyr::select("Date"="Date",
         "Site"="SITE",
         "dm_wind_velocity"="FM",
         "dm_temperature"="TMK")

We will not keep the weather data written down by hand in the field, as it is not as reliable as the data from the DWD weather station. However, an interesting variable to keep is the start time of walking the transect for the netting method. This could be an important factor in the number of pollinators caught, as some species are more active in the morning or evening.

III.A.2. Field weather data - Start time of netting

#transform field_weather into start_net dataframe
start_net <- field_weather %>%
  dplyr::select(
         "Date"="date",
         "Site"="SITE",
         "Transect"="transect",
         "Start_time"="start_time")

start_net <- start_net %>%
  mutate(
    Start_time = gsub("\\.", ":", Start_time),  # Replace "." with ":"
    Start_time = ifelse(grepl("^\\d:", Start_time), paste0("0", Start_time), Start_time),  # Pad single-digit hours: "9:25" -> "09:25"
    Start_time = ifelse(nchar(Start_time) == 4, paste0(Start_time, "0"), Start_time),  # Ensure "14:0" -> "14:00"
    Start_time = ifelse(nchar(Start_time) == 2, paste0(Start_time, ":00"), Start_time),  # Ensure "14" -> "14:00"
    Hour = as.integer(substr(Start_time, 1, 2)),  # Extract hour
    Minute = as.integer(substr(Start_time, 4, 5)))%>%  # Extract minutes
    #add new column minutes since 9 am
    mutate(minutes_since_9am = (Hour - 9) * 60 + Minute)%>%
  #remove unnecessary columns
  dplyr::select(-c("Hour", "Minute", "Start_time"))
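
The cleaning steps above rely on string lengths to repair the hand-written start times. A sketch of an alternative is to split each value into hour and minute and compute the offset numerically. The helper `minutes_since()` is hypothetical; it assumes that a missing minute part means a full hour and that a single-digit minute such as "14:0" is a truncated "14:00", as in the comments above.

```r
# Hypothetical helper: minutes elapsed since `ref_hour` for times such as
# "9.25", "14:0" or "14"
minutes_since <- function(x, ref_hour = 9) {
  x <- gsub("\\.", ":", trimws(as.character(x)))   # "9.25" -> "9:25"
  parts <- strsplit(x, ":", fixed = TRUE)
  hour <- vapply(parts, function(p) as.integer(p[1]), integer(1))
  minute <- vapply(parts, function(p) {
    if (length(p) < 2) return(0L)                  # "14"   -> minute 0
    m <- p[2]
    if (nchar(m) == 1) m <- paste0(m, "0")         # "14:0" -> "00", "14:3" -> "30"
    as.integer(m)
  }, integer(1))
  (hour - ref_hour) * 60 + minute
}

minutes_since(c("9.25", "14:0", "14", "10:30"))
```

Working on the split parts avoids special cases for individual values, at the cost of still having to decide how ambiguous entries should be read.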
    
     

#barplot start time per site, colored by transect
start_net %>%
    ggplot(aes(x = Site, y = minutes_since_9am, fill = Transect)) +
  geom_bar(stat = "identity", position= "dodge") +
  labs(title = "Start time of transect walk per site",
       x = "Site",
       y = "Time (minutes since 9 am)") +
  theme(legend.position = "left")+
  #viridis
  scale_fill_viridis_d() 

write.csv(start_net, "C:/Users/Almas/Desktop/UNI_LEIPSI/Thesis/Thesis_Rproject/data/start_net.csv", row.names = FALSE)

III.A.3. Landscape data & combining

Floral_simpson_index_T <- relative_flower %>%
  dplyr::select("Site"="Site",
         "Transect"="Transect",
         "Site_type"="Site_type",
         "Floral_simpson_index"="Floral_simpson_index")%>%
  distinct()%>%
  #new column with average simpson per transect
  group_by(Site, Transect, Site_type) %>%
  summarise(Floral_simpson_index_T = sum(Floral_simpson_index)/3)%>%
  ungroup()
## `summarise()` has grouped output by 'Site', 'Transect'. You can override using
## the `.groups` argument.
planty <- relative_flower %>%
  dplyr::select(Site, Plot_Cover, Transect, average_flower_cover)%>%
  #average the cover per transect
  group_by(Site, Transect,average_flower_cover) %>%
  summarise(Plot_Cover_T = mean(Plot_Cover, na.rm = TRUE))
## `summarise()` has grouped output by 'Site', 'Transect'. You can override using
## the `.groups` argument.
envir_data <- start_net %>%
  #select the relevant columns
  #dplyr::select(-c("Start_time", "Time_Bin"))%>%
  
  #join the dwd weather data
  full_join(dwd_weather, by = c("Date" = "Date", "Site"))%>%
  
  #join the landscape data
  full_join(landscape_data, by = c("Site" = "Site"))%>%
  
  #join the top2 data
  full_join(top2, by = c("Site" = "Site", "Transect")) %>%
  dplyr::select(-c( "Site_type", "average_flower_cover"))%>% #site_type and average_flower_cover from the top2 is lacking some rows, we'll get this variable from another dataframe 

  
  #join the plant data
  full_join(planty, by = c("Site" = "Site", "Transect"))%>%
  
  #remove extra columns
  #dplyr::select( -c("top2_ratio", "Site_type"))%>%
  #dplyr::select( -c("top2_ratio", "Floral_shannon_index"   , "Floral_species_richness"))%>%
  
  #add only simpson index from relative_flower 
  full_join(Floral_simpson_index_T, by= c("Site", "Transect"))%>%

  
  #fill in NA with 0 in the "Pastinaca.sativa", "Daucus.carota" and "top2_ratio" columns
  mutate(across(c("Pastinaca.sativa", "Daucus.carota"), ~replace_na(., 0)))%>%
  mutate(across(c("top2_ratio"), ~replace_na(., 0)))%>%

  
  #change the date to a date format
  mutate(Date = as.Date(Date, format = "%Y-%m-%d"))%>%
  
  #create Days_since_start variable
  mutate(Days_since_start = as.numeric(Date - min(Date)+1))%>%
  
  #rename rows in Site_type column 1-5y to young_restored and Reference to reference
  mutate(Site_type = ifelse(Site_type == "1-5 y", "young_restored", "reference"))%>%
  
  distinct()
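
The per-transect Simpson index above is computed as `sum(...)/3` after `distinct()`, which equals the mean only when every transect contributes exactly three distinct values. The toy data below (invented numbers, hypothetical object names, and the assumption that the index is a plot-level value) shows how the two quantities separate when two plots of a transect share the same index, so that dropping duplicates keeps only two rows.

```r
# Invented plot-level values: T1 has three distinct indices, T2 has a duplicate
plots <- data.frame(
  Transect = c("T1", "T1", "T1", "T2", "T2", "T2"),
  Floral_simpson_index = c(0.2, 0.5, 0.8, 0.4, 0.4, 0.7)
)

# Mean over all plots of a transect
mean_all <- tapply(plots$Floral_simpson_index, plots$Transect, mean)

# What sum()/3 gives once duplicated rows have been dropped
plots_distinct <- unique(plots)
sum_over_3 <- tapply(plots_distinct$Floral_simpson_index,
                     plots_distinct$Transect, sum) / 3
```

Whether this matters depends on whether identical plot values occur in `relative_flower`; keeping a plot identifier in the `distinct()` step, if the table has one, would avoid the issue.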

III.A.4. Correlation matrix of environmental variables

At a transect level.

#correlation matrix of all the numerical environmental variables
tmp_cor_envir <- envir_data %>%
  dplyr::select(where(is.numeric)) %>%
  na.omit() %>%
  #remove minutes_since_9am column
  #select(-c("minutes_since_9am")) %>%
  mutate(across(everything(), ~ as.numeric(scale(.x))))%>% # Z-score standardization; as.numeric() keeps plain vectors instead of 1-column matrices
  distinct()
  
tmp_qq_plots <- list()
# build a Q-Q plot for each variable
for (i in seq_len(ncol(tmp_cor_envir))) {
  tmp_qq_plots[[i]] <- ggplot(tmp_cor_envir, aes(sample = .data[[colnames(tmp_cor_envir)[i]]])) +
  stat_qq()+
  stat_qq_line()+
  ggtitle(str_wrap(colnames(tmp_cor_envir)[i], width = 20))+
  theme(plot.title = element_text(size = 10))  
}

# print every Q-Q plot, however many variables there are
for (p in tmp_qq_plots) print(p)

tmp_env_cor_results <- rcorr(as.matrix(tmp_cor_envir), type = "spearman") 

# extract correlation matrix
tmp_env_cor_matrix <- tmp_env_cor_results$r

# extract the p-values
tmp_env_p_matrix <- tmp_env_cor_results$P

# plot correlation matrix (significant correlations are marked with stars)
tmp_corr_num <- corrplot::corrplot(tmp_env_cor_matrix, 
         type = "upper",
         method = "color",
         diag = FALSE,                     # remove diagonal
         #addCoef.col = "black",           # add correlation coefficients, not used 
         p.mat = tmp_env_p_matrix,         # add p-values matrix
         sig.level = c(0.001, 0.01, 0.05), # significance thresholds
         tl.col = "black",                 # color of variable labels
         cl.align.text = "l",              # alignment of color legend
         cl.offset = 0.3,                  # offset color legend text to the right
         number.cex = 0.8,                 # size of coefficient text
         tl.cex = 0.8,                     # size of variable labels
         insig = "label_sig",              # add stars according to significance thresholds
         pch.cex = 0.8)                      # size of stars

# save the plot as a png
dev.copy(png, "C:/Users/Almas/Desktop/UNI_LEIPSI/Thesis/Thesis_Rproject/figures/correlation_plot_num.png", width = 1600, height = 1600, res = 300)
## png 
##   3
dev.off()
## png 
##   2
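
Spearman's coefficient is computed on ranks, so the Z-score standardization applied before `rcorr()` leaves the correlation matrix unchanged. A small check with simulated data, using base `cor()` in place of `Hmisc::rcorr()` to keep the sketch free of dependencies:

```r
set.seed(1)
# Simulated stand-ins for two environmental variables
x <- rexp(30)
y <- x^2 + rnorm(30, sd = 0.5)

# Spearman correlation on raw values and on Z-scores
rho_raw    <- cor(x, y, method = "spearman")
rho_scaled <- cor(as.numeric(scale(x)), as.numeric(scale(y)), method = "spearman")
```

The two coefficients agree because centering and dividing by a positive standard deviation do not change the order of the observations.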

III.A.5. PCA of environmental variables

#PCA of the environmental data
tmp_cor_envir_pca <- tmp_cor_envir %>%
  #remove NA values
  na.omit() %>%
  prcomp(scale. = TRUE)

# plot the PCA
(tmp_cor_envir_pca_plot <- fviz_pca_biplot(tmp_cor_envir_pca, 
                                         geom = c("point", "text"), 
                                         col.var = "contrib", # color by contributions to axes
                                         gradient.cols = c("#00AFBB", "#E7B800", "#FC4E07"), #this gradient indicates the contribution of each variable to the axis
                                         repel = TRUE, # avoid text overlapping
                                         title = "PCA of Environmental Data"))

# remove all dataframes with tmp_ prefix
rm(list = ls(pattern = "^tmp_"))
#rm(top2, landscape_data, dwd_weather, field_weather, start_net)
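
The biplot shows variables and observations on the first two axes; the share of variance carried by each axis can be read from the `sdev` component of the `prcomp` object. A sketch on the built-in `USArrests` data, used here as a stand-in for the environmental table:

```r
# PCA on a built-in data set, standardizing the variables as above
pca_demo <- prcomp(USArrests, scale. = TRUE)

# Proportion of variance explained by each principal component
var_explained <- pca_demo$sdev^2 / sum(pca_demo$sdev^2)

# Cumulative proportion, useful to decide how many axes to interpret
cum_explained <- cumsum(var_explained)
```

Reporting these proportions next to the biplot helps judge how much of the environmental variation the two displayed axes actually summarize.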

III.A.6. Landcover class vs Site Type

We are going to create barplots comparing the environmental variables between the two site types. First we’ll average the variables per site_type, then we’ll plot the average values per site type.

# scale_fill_manual(values = c("agri" = "#D6CFB7FF", "grass" = "#6D8325FF", "snh" = "#647D4BFF", "forest" = "#121510FF", "urban" = "#FA6900FF", "water" = "#7FC7AFFF"))

#average per site type
envir_data_avg <- envir_data %>%
  group_by(Site_type) %>%
  #summarize across all numerical variables
  summarise(across(where(is.numeric), \(x) mean(x, na.rm = TRUE))) %>%
  ungroup()%>%
  #remove irrelevant columns
  dplyr::select(-c("Days_since_start", "minutes_since_9am", "dm_wind_velocity", "dm_temperature"))
#barplot of the average snh  per site type
(snh <- 
  envir_data_avg %>%
  ggplot(aes(x = Site_type, y = snh)) +
  geom_bar(stat = "identity", fill = "#4B644BFF") +
  labs(title = "Average Surrounding Semi Natural Habitat Surface per Site Type",
       x = "Site type",
       y = "Average SNH") +
  theme(legend.position = "left") +
  #y limit (inside the parentheses so that it is stored in the object)
  ylim(0, 60))

#barplot of the average water per site type
(water <- 
  envir_data_avg %>%
  ggplot(aes(x = Site_type, y = water)) +
  geom_bar(stat = "identity", fill = "#A9CCE3") +
  labs(title = "Average Surrounding Water Surface per Site Type",
       x = "Site type",
       y = "Average Water") +
  theme(legend.position = "left") +
  #y limit (inside the parentheses so that it is stored in the object)
  ylim(0, 60))

#barplot of the average agri per site type
(agri <- 
  envir_data_avg %>%
  ggplot(aes(x = Site_type, y = agri)) +
  geom_bar(stat = "identity", fill = "#DABD61FF") +
  labs(title = "Average Surrounding Agricultural Surface per Site Type",
       x = "Site type",
       y = "Average Agri") +
  theme(legend.position = "left") +
  #y limit (inside the parentheses so that it is stored in the object)
  ylim(0, 60))

#barplot of the average urban per site type
(urban <- 
  envir_data_avg %>%
  ggplot(aes(x = Site_type, y = urban)) +
  geom_bar(stat = "identity", fill = "#803342FF") +
  labs(title = "Average Surrounding Urban Surface per Site Type",
       x = "Site type",
       y = "Average Urban") +
  theme(legend.position = "left") +
  #y limit (inside the parentheses so that it is stored in the object)
  ylim(0, 60))

#barplot of the average forest per site type
(forest <- 
  envir_data_avg %>%
  ggplot(aes(x = Site_type, y = forest)) +
  geom_bar(stat = "identity", fill = "#121510FF") +
  labs(title = "Average Surrounding Forest Surface per Site Type",
       x = "Site type",
       y = "Average Forest") +
  theme(legend.position = "left") +
  #y limit (inside the parentheses so that it is stored in the object)
  ylim(0, 60))

#barplot of the average grassland per site type
(grassland <- 
  envir_data_avg %>%
  ggplot(aes(x = Site_type, y = grass)) +
  geom_bar(stat = "identity", fill = "#6D8325FF") +
  labs(title = "Average Surrounding Grassland Surface per Site Type",
       x = "Site type",
       y = "Average Grassland") +
  theme(legend.position = "left") +
  #y limit (inside the parentheses so that it is stored in the object)
  ylim(0, 60))

envir_data_avg_long <- envir_data_avg %>%
  #keep only columns 1:7
  dplyr::select(1:7) %>%
  #long format to have stacked barplot
  pivot_longer(cols = c("agri", "grass", "snh", "forest", "urban", "water"), 
               names_to = "Landcover_class", 
               values_to = "Area_m2")

#stacked barplot of the average landcover per site type
(stack_land <- envir_data_avg_long %>%
  ggplot(aes(x = Site_type, y = Area_m2, fill = Landcover_class)) +
  geom_bar(stat = "identity", color="black", linewidth = 0.2) +
  labs(title = "Average % per Landcover Class (1km Radius)",
       x = "Site type",
       y = "Average Area (m2)") +
  theme(legend.position = "right")+
  scale_fill_manual(
    values = c("agri" = "#DABD61FF", "grass" = "#A3A380", "snh" = "#4B644BFF", "forest" = "#121510FF", "urban" = "#803342FF", "water" = "#A9CCE3"),
    labels = c("agri" = "Agriculture", "grass" = "Grassland", "snh" = "Semi Natural Habitat", "forest" = "Forest", "urban" = "Urban", "water" = "Water"))+
    #legend title
  labs(fill = "Land cover class") +
  scale_x_discrete(labels = c(
  "young_restored" = "Young Restored",
  "reference" = "Reference")))

#save stacked barplot
ggsave("C:/Users/Almas/Desktop/UNI_LEIPSI/Thesis/Thesis_Rproject/figures/stacked_barplot_landcover.png", 
       plot = stack_land, 
       width = 8, height = 4, dpi = 600)

    
  
#plot the data in a pie chart, facet per Site_type
(pie_land <- envir_data_avg_long %>%
  ggplot(aes(x = "", y = Area_m2, fill = Landcover_class)) +
  geom_bar(stat = "identity", width = 1) +
  coord_polar(theta = "y") +
  labs(title = "Average % per Landcover Class (1km Radius)",
       x = "",
       y = "") +
  theme_void() +
  facet_wrap(~ Site_type) +
  theme(legend.position = "right") +
  scale_fill_manual(values = c("agri" = "#DABD61FF", "grass" = "#A3A380", "snh" = "#4B644BFF", "forest" = "#121510FF", "urban" = "#803342FF", "water" = "#A9CCE3")))

III.A.7. Flower survey vs site type

#transform the plants dataframe into wide format to include all species at all sites with a 0 percentage 
# this is needed to have accurate percentage values for the average flower cover % per flower species per site type.  
avg_flower_long <- plants%>%
  dplyr::select(-BB)%>%
  #wide format, fill with 0
  pivot_wider(names_from = Plant_sp, values_from = Cover_percentage, values_fill = 0)%>%
  #long format
  pivot_longer(cols = 5:74, 
               names_to = "Plant_sp", 
               values_to = "Cover_percentage")
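
Filling absent species with 0 before averaging changes the result: without the zeros, the mean is taken only over the records where the species was seen. A toy illustration with invented cover values in base R (the names `obs` and `full` and the plot labels are hypothetical):

```r
# Invented records: species A seen in both plots, species B only in plot P1
obs <- data.frame(
  Plot = c("P1", "P2", "P1"),
  Plant_sp = c("A", "A", "B"),
  Cover_percentage = c(10, 20, 6)
)

# Every Plot x species combination, with 0 where nothing was recorded
full <- merge(expand.grid(Plot = unique(obs$Plot), Plant_sp = unique(obs$Plant_sp),
                          stringsAsFactors = FALSE),
              obs, all.x = TRUE)
full$Cover_percentage[is.na(full$Cover_percentage)] <- 0

# Mean cover per species: presence-only records vs zero-filled table
mean_present <- tapply(obs$Cover_percentage, obs$Plant_sp, mean)
mean_filled  <- tapply(full$Cover_percentage, full$Plant_sp, mean)
```

For species B the presence-only mean overstates the average cover, which is the bias the wide-then-long round trip above is meant to remove.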


daucus <- avg_flower_long%>%
  #filter only daucus rows
  filter(grepl("Daucus", Plant_sp))%>%
  #average cover_percentage of daucus carota per site type 
  group_by(Site_type) %>%
  summarise(Cover_daucar_st = mean(Cover_percentage, na.rm = TRUE)) %>%
  ungroup()

#barplot of the average daucus cover per site type
(daucus_plot <- daucus %>%
  ggplot(aes(x = Site_type, y = Cover_daucar_st)) +
  geom_bar(stat = "identity", fill = "#A9CCE3") +
  labs(title = "Average Daucus carota Cover % per Site Type",
       x = "Site type",
       y = "Average Daucus carota Cover") +
  theme(legend.position = "left")+
  #y limit
  ylim(0, 100))

pastinaca <- avg_flower_long%>%
  #filter only Pastinaca sativa rows
  filter(grepl("Pastinaca sativa", Plant_sp))%>%
  #average cover_percentage of Pastinaca sativa per site type 
  group_by(Site_type) %>%
  summarise(Cover_passat_st = mean(Cover_percentage, na.rm = TRUE)) %>%
  ungroup()

#barplot of the average pastinaca cover per site type
(pastinaca_plot <- pastinaca%>%
  ggplot(aes(x = Site_type, y = Cover_passat_st)) +
  geom_bar(stat = "identity", fill = "#F5C542") +
  labs(title = "Average Pastinaca sativa Cover % per Site Type",
       x = "Site type",
       y = "Average Pastinaca sativa Cover") +
  theme(legend.position = "left")+
  #y limit
  ylim(0, 100))

#excluding passat and daucar
exc <- avg_flower_long%>%
  #filter everything but daucus and pastinaca
  filter(!grepl("Daucus|Pastinaca", Plant_sp))%>%
  #average cover_percentage of each remaining species per site type 
  group_by(Site_type, Plant_sp) %>%
  summarise(Cover_exc_st = mean(Cover_percentage, na.rm = TRUE)) %>%
  ungroup()%>%
  #sum up per site type
  group_by(Site_type) %>%
  summarise(Cover_exc_st = sum(Cover_exc_st, na.rm = TRUE)) %>%
  ungroup()
## `summarise()` has grouped output by 'Site_type'. You can override using the
## `.groups` argument.
avg1 <- exc %>%
  #join daucus
  full_join(daucus, by = "Site_type") %>%
  #join pastinaca
  full_join(pastinaca, by = "Site_type")%>%
  #long format for stacked barplot
  pivot_longer(cols = c("Cover_daucar_st", "Cover_passat_st", "Cover_exc_st"), 
               names_to = "Plant_sp", 
               values_to = "Cover_percentage")

#stacked bar plot 
(avg2 <- avg1 %>%
  ggplot(aes(x = Site_type, y = Cover_percentage, fill = factor(Plant_sp, levels= c("Cover_exc_st", "Cover_passat_st","Cover_daucar_st")))) +
  geom_bar(stat = "identity", position = "stack",  color = "black", linewidth = 0.2) +
  labs(title = "Average Flower Cover Percentage",
       x = "Site type",
       y = "Average Flower Cover %") +
  theme(legend.position = "bottom") +
      # Stacked legend items
  guides(fill = guide_legend(ncol = 1)) + # Set number of columns to 1 to stack vertically
  scale_fill_manual(
  values = c(
    "Cover_exc_st" = "#A3A380",
    "Cover_daucar_st" = "#A9CCE3",
    "Cover_passat_st" = "#DABD61FF"
  ),
  labels = c(
    "Cover_exc_st" = "Cover % all other flower species",
    "Cover_passat_st" = "Cover % Pastinaca sativa",
    "Cover_daucar_st" = "Cover % Daucus carota"
  )) +
    #legend title
  labs(fill = "") +
  scale_x_discrete(labels = c(
  "1-5 y" = "Young Restored",
  "Reference" = "Reference"
))+
  #y limit
  ylim(0, 15))

#average flower cover % with all flower species to compare to the previous plot
avg_full <- avg_flower_long%>%
  #keep all flower species
  #average cover_percentage per species per site type 
  group_by(Site_type, Plant_sp) %>%
  summarise(Cover_full = mean(Cover_percentage, na.rm = TRUE)) %>%
  ungroup()%>%
  #sum up per site type
  group_by(Site_type) %>%
  summarise(Cover_full = sum(Cover_full, na.rm = TRUE)) %>%
  ungroup()
## `summarise()` has grouped output by 'Site_type'. You can override using the
## `.groups` argument.
# Do the total averages match ? Yes
sum(avg1$Cover_percentage[avg1$Site_type == "1-5 y"])==avg_full$Cover_full[avg_full$Site_type == "1-5 y"]
## [1] TRUE
sum(avg1$Cover_percentage[avg1$Site_type == "Reference"])==avg_full$Cover_full[avg_full$Site_type == "Reference"]
## [1] TRUE
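
Both checks return TRUE here, but `==` on sums of floating-point means can fail from rounding alone. A tolerance-based comparison with `all.equal()` is the usual safeguard; the numbers below are illustrative and unrelated to the cover data.

```r
# Classic case where exact equality fails although the values agree
a <- 0.1 + 0.2
b <- 0.3

exact_match  <- a == b                   # exact comparison of doubles
approx_match <- isTRUE(all.equal(a, b))  # comparison within the default tolerance
```

Wrapping `all.equal()` in `isTRUE()` matters because it returns a descriptive string rather than FALSE when the values differ.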

III.B. Traditional methods - NETTING

III.B.1. Data preparation

#set lowest_taxa to "Coccinellidae" for every row whose value contains "Coccinellidae_"
netting <- netting %>%
  mutate(lowest_taxa = ifelse(grepl("Coccinellidae_", lowest_taxa), "Coccinellidae", lowest_taxa))

III.B.2. Data exploration

#barplot of the number of each taxon caught in the netting per site type
netting %>%
  #group by site type and taxa
  group_by(site_type, lowest_taxa) %>%
  #count the number of each taxon caught
  summarise(Count = n()) %>%
  ggplot(aes(x = site_type, y = Count, fill = lowest_taxa)) +
  geom_bar(stat = "identity", color = "black") +
  labs(title = "Count of each taxon caught during transect walk",
       x = "Site type",
       y = "Number of individuals caught") +
  #viridis  color palette for the taxa 
  #scale_fill_viridis_d(option = "rocket") +
  #scale_fill_manual(values = custom25) +  # Apply manually
  theme(legend.position = "none")
## `summarise()` has grouped output by 'site_type'. You can override using the
## `.groups` argument.

III.B.3. Most visited flower species

# find out which plant species are the most visited by pollinators
top_plants <- netting %>%
  group_by(plant_species) %>%
  summarise(total_interactions = sum(total_interaction, na.rm = TRUE)) %>%
  arrange(desc(total_interactions)) %>%
  head(10)
  
# Add a column to define the color category
top_plants$fill_color <- ifelse(top_plants$plant_species == "Daucus carota", "#A9CCE3",
                          ifelse(top_plants$plant_species == "Pastinaca sativa", "#DABD61FF", "#A3A380"))

#Plot the top 10 plant species
(top_plants_plot <- ggplot(top_plants, aes(x = reorder(plant_species, total_interactions), 
                       y = total_interactions, 
                       fill = fill_color)) +
  geom_bar(stat = "identity", width = 1, color ="black", linewidth = 0.2) +
  scale_fill_identity() +
  labs(title = "Transect walk: 10 Most Visited Flower Species",
       x = "Plant Species",
       y = "Total flower/pollinator interactions") +
  theme(axis.text.x = element_text(angle = 45, hjust = 1, face = "italic")))

#combine avg2 and top_plants_plot into one plot with a,b
library(cowplot)
(flower_visits <- cowplot::plot_grid(avg2 , top_plants_plot, labels = c("A", "B"),align = "h", axis = "tb", 
          rel_widths = c(1, 1), ncol = 2))

#save the plot 
# Save the plot with A4 dimensions (in cm) 
ggsave("C:/Users/Almas/Desktop/UNI_LEIPSI/Thesis/Thesis_Rproject/figures/flower_visits_A4_cm.png", plot = flower_visits, 
       width = 21, height = 15, units = "cm", 
       dpi = 600, device = "png")

III.C. Traditional methods - PAN TRAPS

III.C.1. Data preparation

#the transect values for pan_family are not in the same format as in the envir_data. let's change that
#changing the str to character and then adding a T before the transect number
pan_family <- pan_family %>%
  mutate(Transect = as.character(Transect)) %>%
  mutate(Transect = paste0("T", Transect))

#combine envir_data with pan_family data
pan_family1 <- pan_family %>%
  
  #right date format
  mutate(Date = as.Date(Date, format = "%Y-%m-%d"))%>%
  
  #join the envir_data with pan_family
  left_join(envir_data, by = c("Site", "Transect", "Date", "Site_type"))%>%
  
  #remove extra columns
  dplyr::select(-minutes_since_9am, -Days_since_start)

III.C.2. Data exploration

# scatterplot the number of taxa caught in the pan traps per transect depending on simpson index
pan_family1 %>%
  #remove 0 count values
  filter(Count > 0) %>%
  ggplot(aes(x = Floral_simpson_index_T, y = Count)) +
  geom_point(aes(color = Site), alpha = 0.5) +
  #geom_smooth(method = "lm", se = FALSE) +
  labs(title = "Number of individuals caught in pan traps per transect depending on Simpson index",
       x = "Floral Simpson index",
       y = "Number of individuals caught") +
  scale_color_manual(values = saturated_pal) +
  theme(legend.position = "left")

#stacked  barplot of the number of individuals caught in the pan traps per transect per site
pan_family1 %>%
  ggplot(aes(x = Site, y = Count, fill = Transect)) +
  geom_bar(stat = "identity") +
  labs(title = "Number of individuals caught in pan traps per transect per site",
       x = "Site",
       y = "Number of individuals caught") +
  #scale_fill_manual(values = saturated_pal) +
  theme(legend.position = "left")

custom25 <- paletteer_c("grDevices::Hawaii", n = 25) # Generate 25 colors from the continuous Hawaii palette

#custom25 <- c(paletteer_d("khroma::light"),paletteer_d("ggprism::muted_rainbow"),paletteer_d("ggsci::default_flatui"))[1:25]  # Ensure exactly 25 colors


#barplot of the number of each taxon caught in the pan traps per site type
pan_family1 %>%
  #group by site and taxa
  group_by(Site, Site_type,Taxa) %>%
  #sum the count of each taxon
  summarise(Count = sum(Count)) %>%
  ggplot(aes(x = Site_type, y = Count, fill = Taxa)) +
  geom_bar(stat = "identity", color = "black") +
  labs(title = "Count of each taxon caught in pan traps per site type",
       x = "Site type",
       y = "Number of individuals caught") +
  #facet_wrap(~ Site_type) +
  #viridis  color palette for the taxa 
  #scale_fill_viridis_d(option = "rocket") +
  scale_fill_manual(values = custom25) +  # Apply manually
  theme(legend.position = "right")
## `summarise()` has grouped output by 'Site', 'Site_type'. You can override using
## the `.groups` argument.

#barplot of the number of each taxon caught in the pan traps per site
pan_family1 %>%
  #group by site and taxa
  group_by(Site, Site_type,Taxa) %>%
  #sum the count of each taxon
  summarise(Count = sum(Count)) %>%
  ggplot(aes(x = Site, y = Count, fill = Taxa)) +
  geom_bar(stat = "identity", color = "black") +
  labs(title = "Count of each taxon caught in pan traps per site",
       x = "Site",
       y = "Number of individuals caught") +
  #facet_wrap(~ Site_type) +
  #viridis  color palette for the taxa 
  #scale_fill_viridis_d(option = "rocket") +
  scale_fill_manual(values = custom25) +  # Apply manually
  theme(legend.position = "right")
## `summarise()` has grouped output by 'Site', 'Site_type'. You can override using
## the `.groups` argument.

#plot of count vs floral simpson index
pan_family1 %>%
  #remove 0 count values
  #filter(Count > 0) %>%
  ggplot(aes(x =Floral_simpson_index_T, y = Count)) +
  geom_point(aes(color = Site), alpha = 0.5) +  # alpha must lie between 0 and 1
  #geom_smooth(method = "lm", se = FALSE) +
  labs(title = "Count vs Floral Simpson Index",
       x = "Floral Simpson Index",
       y = "Count") +
  scale_color_manual(values = saturated_pal) +
  theme(legend.position = "bottom")

pan_family1 %>%
  #remove 0 count values
  filter(Count > 0) %>%
  #remove count higher than 30
  #filter(Count < 30) %>%
  ggplot(aes(x = Site_type, y = Count)) +
  geom_boxplot(fill = "lightblue") +
  theme_minimal() +
  labs(y = "Number of individuals caught", x = "Site Type")+
  theme(legend.position = "bottom")

III.C.3. Pan traps vs InsectDetect

#barplot of the number of each taxon caught with the platform camera per site
platform_camera %>%
  #filter out top1 that start with none_ as they are not real taxa
  filter(!grepl("none_", top1)) %>%
  #remove top1_prob_weighted that are below 0.5
  #filter(top1_prob_weighted > 0.5) %>%
  #group by site and taxa
  group_by(location, top1) %>%
  #count the number of each top1 caught
  summarise(Count = n()) %>%
  ggplot(aes(x = location, y = Count, fill = top1)) +
  geom_bar(stat = "identity", color = "black") +
  labs(title = "Count of each taxon caught on platforms per site",
       x = "Site",
       y = "Number of individuals caught") +
  #viridis  color palette for the taxa 
  #scale_fill_viridis_d(option = "rocket") +
  scale_fill_manual(values = custom25) +  # Apply manually
  theme(legend.position = "left")
## `summarise()` has grouped output by 'location'. You can override using the
## `.groups` argument.

# unify the taxa names in both datasets
tmp_pan_family <- pan_family%>%
 # adapt taxa categories to the ones used in platform data (less precise)
  mutate(Category = case_when(
    Taxa %in% c("cantharidae", "carabidae", "curculionidae", "elateridae",
                "mordellidae", "staphylinidae") ~ "beetle",
    Taxa %in% c("coccinellidae") ~ "beetle_cocci",
    Taxa %in% c("hemiptera") ~ "bug",
    Taxa %in% c("diptera","calliphoridae","cecidomyiidae","tachinidae",
                "sepsidae","ephydridae","muscidae","asilidae","stratiomyidae", "polleniidae","acalyptrate" ) ~ "fly",
    Taxa %in% c("dasypoda","apidae","colletidae") ~ "bee_apis",
    Taxa %in% c("bombus") ~ "bee_bombus",
    Taxa %in% c("sarcophagidae") ~ "fly_sarco",
    Taxa %in% c("symphyta","apocrita","proctotrupidae","tenthredinidae") ~ "wasp",
    Taxa %in% c("empididae") ~ "fly_empi",
    TRUE ~ Taxa
    ))%>%
    #remove all rows that have a 0 "count" value
  filter(Count != 0) %>%
  
  #select only the relevant columns for the comparison
  dplyr::select(Site, Transect, Site_type, Category, Count)

#select only the relevant columns of platform_camera for the comparison
tmp_platform_camera <- platform_camera %>%
  
  mutate(transect = as.character(transect)) %>%
  mutate(transect = paste0("T", transect))%>%
  
  #filter out top1 that start with none_ as they are not real taxa
  filter(!grepl("none_", top1)) %>%
  
  #remove top1_prob_weighted that are below 0.5
  filter(top1_prob_weighted > 0.5) %>%
  
  #group by site and taxa
  group_by(location,transect, top1) %>%
  
  #count the number of each top1 caught
  summarise(Count = n()) %>%
  ungroup() %>%
  
  #add empty Site_Type column
  mutate(Site_type = NA)%>%
  
  #remove columns that are not needed for comparison with pan_family, rename top1 to Category
  dplyr::select(Site=location, Site_type,Transect=transect, Category = top1, Count)
## `summarise()` has grouped output by 'location', 'transect'. You can override
## using the `.groups` argument.
#add method column to both pan_family and platform_camera
tmp_pan_family$Method <- "bowl_trap"
tmp_platform_camera$Method <- "platform_camera"

#combine the two datasets
tmp_combined <- rbind(tmp_pan_family, tmp_platform_camera)%>%
  #adapt empty Site_type following the Site column
  mutate(Site_type = ifelse(Site %in% c("DES", "HLI", "JEP", "STP", "WUP"), "reference", "young_restored"))

tmp_combined %>%
  mutate(
    Method = fct_recode(Method,
                        "Pan Trap" = "bowl_trap",
                        "Platform Camera" = "platform_camera"),
    Site_type = fct_recode(Site_type,
                           "Reference Site" = "reference",
                           "Young Restored Site" = "young_restored")
  ) %>%
  ggplot(aes(x = Site, y = Count, fill = Category)) +
  geom_bar(stat = "identity", position = "stack", color = "black", size= 0.2) +
  facet_wrap(~Method + Site_type, ncol = 2, scales = "free_x") +  # free_x allows each facet to drop empty Site slots 
  scale_x_discrete(drop = TRUE) +  # ensure unused levels are dropped
  theme_minimal() +
  theme(axis.text.x = element_text(angle = 45, hjust = 1)) +
  labs(title = "Comparison of the two sampling methods",
       x = "Site",
       y = "Count")+
    scale_fill_manual(values = custom25)

  # Add confidence threshold annotation
  #annotate("text", x = Inf, y = Inf, label = paste("Confidence threshold:", 0.85), hjust = 1.1, vjust = 1.1, size = 3)

tmp_combined %>%
  mutate(
    Method = fct_recode(Method,
                        "Pan Trap" = "bowl_trap",
                        "Platform Camera" = "platform_camera"),
    Site_type = fct_recode(Site_type,
                           "Reference Site" = "reference",
                           "Young Restored Site" = "young_restored")
  ) %>%
  filter(Count > 0) %>%
  ggplot(aes(x = Site, y = Count, fill = Category)) +
  geom_bar(stat = "identity", position = "stack", color = "black", size= 0.2) +
  facet_grid(Method ~ Site_type,scales = "free_x", space = "free_x") +  # grid layout!
  scale_x_discrete(drop = TRUE) +
  #theme_minimal() +
  theme(axis.text.x = element_text(angle = 45, hjust = 1)) +
  labs(#title = "Comparison of the Two Sampling Methods",
       x = "Site",
       y = "Count") +
  scale_fill_manual(values = custom25)

#save the plot
ggsave("C:/Users/Almas/Desktop/UNI_LEIPSI/Thesis/Thesis_Rproject/figures/comparison_pan_platform.png", 
       plot = last_plot(), 
       width = 8, height = 6, dpi = 600)

sum(tmp_combined$Count[tmp_combined$Method == "bowl_trap"])
## [1] 101
sum(tmp_combined$Count[tmp_combined$Method == "platform_camera"])
## [1] 696
library(ggplot2)
library(dplyr)
library(forcats)
library(patchwork)




# Common data prep
tmp_combined2 <- tmp_combined %>%
  mutate(
    Method = fct_recode(Method,
                        "Pan Trap" = "bowl_trap",
                        "Platform Camera" = "platform_camera"),
    Site_type = fct_recode(Site_type,
                           "Reference Site" = "reference",
                           "Young Restored Site" = "young_restored")
  ) %>%
  filter(Count > 0)

#color palette
categories <-  sort(unique(tmp_combined2$Category))

# Get one color per category from the Viridis palette
custom18 <- paletteer::paletteer_c("grDevices::Viridis", n = length(categories))

# Create the named color vector
custom18 <- setNames(custom18, categories)

# Base theme with horizontal site labels
base_theme <- theme(
  axis.text.x = element_text(angle = 0, hjust = 0.5),
  legend.position = "right"  # legend sits to the right of each plot
)

# Plot 1: Pan Trap with y limit
p1 <- tmp_combined2 %>%
  filter(Method == "Pan Trap") %>%
  ggplot(aes(x = Site, y = Count, fill = Category)) +
  geom_bar(stat = "identity", position = "stack", color = "black", size = 0.2) +
  facet_grid(. ~ Site_type, scales = "free_x", space = "free_x") +
  scale_x_discrete(drop = TRUE) +
  labs(
    title = "Pan Trap",
    x = "Site",
    y = "Count"
  ) +
  scale_fill_manual(values = custom18) +
  ylim(0, 45) +
  guides(fill = guide_legend(ncol = 2))+
  base_theme

# Plot 2: Platform Camera (free y scale)
p2 <- tmp_combined2 %>%
  filter(Method == "Platform Camera") %>%
  ggplot(aes(x = Site, y = Count, fill = Category)) +
  geom_bar(stat = "identity", position = "stack", color = "black", size = 0.2) +
  facet_grid(. ~ Site_type, scales = "free_x", space = "free_x") +
  scale_x_discrete(drop = TRUE) +
  labs(
    title = "Platform Camera",
    x = "Site",
    y = "Count"
  ) +
  scale_fill_manual(values = custom18) +
  guides(fill = guide_legend(ncol = 2))+
  base_theme

# Combine plots (each with its own legend)
p1 / p2  # no guides = "collect", so each plot keeps its own legend

# Save the combined plot
ggsave("C:/Users/Almas/Desktop/UNI_LEIPSI/Thesis/Thesis_Rproject/figures/comparison_pan_platform_TH0.5.png", 
       plot = last_plot(), 
       width = 8, height = 6, dpi = 600)
library(shiny)
library(ggplot2)
library(dplyr)

# Define UI
ui <- fluidPage(
  titlePanel("Interactive Threshold Adjustment"),
  sidebarLayout(
    sidebarPanel(
      sliderInput("threshold", "Select top1_prob_weighted threshold:", 
                  min = 0.4, max = 1.0, value = 0.85, step = 0.01)
    ),
    mainPanel(
      plotOutput("barPlot"),
      verbatimTextOutput("counts")
    )
  )
)

# Define server logic
server <- function(input, output) {
  
  # Reactive dataset based on threshold
  filtered_data <- reactive({
    tmp_platform_camera <- platform_camera %>%
      mutate(transect = as.character(transect)) %>%
      mutate(transect = paste0("T", transect)) %>%
      filter(!grepl("none_", top1)) %>%
      filter(top1_prob_weighted > input$threshold) %>%
      group_by(location, transect, top1) %>%
      summarise(Count = n(), .groups = "drop") %>%
      mutate(Site_type = NA) %>%
      dplyr::select(Site = location, Site_type, Transect = transect, Category = top1, Count)
    
    tmp_pan_family$Method <- "bowl_trap"
    tmp_platform_camera$Method <- "platform_camera"
    
    tmp_combined <- rbind(tmp_pan_family, tmp_platform_camera) %>%
      mutate(Site_type = ifelse(Site %in% c("DES", "HLI", "JEP", "STP", "WUP"), 
                                "reference", "young_restored"))
    
    return(tmp_combined)
  })
  
  # Render plot
  output$barPlot <- renderPlot({
    ggplot(filtered_data(), aes(x = Site, y = Count, fill = Category)) +
      geom_bar(stat = "identity", position = "stack") +
      facet_wrap(~Method) +
      theme_minimal() +
      theme(axis.text.x = element_text(angle = 45, hjust = 1)) +
      labs(title = "Comparison of the two sampling methods",
           x = "Site",
           y = "Count") +
      scale_fill_manual(values = custom25)
  })
  
  # Show sum of counts for each method
  output$counts <- renderText({
    data <- filtered_data()
    bowl_count <- sum(data$Count[data$Method == "bowl_trap"], na.rm = TRUE)
    platform_count <- sum(data$Count[data$Method == "platform_camera"], na.rm = TRUE)
    paste("Total Count (Bowl Trap):", bowl_count, "\nTotal Count (Platform Camera):", platform_count)
  })
}

# Run the application 
shinyApp(ui = ui, server = server)
#remove all dataframes with tmp_ prefix
rm(list = ls(pattern = "^tmp_"))
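
The effect of the confidence threshold can also be tabulated without an interactive app. The following is a minimal base R sketch on invented values: `toy_detections` and `count_above()` are hypothetical names, and only the column names `top1` and `top1_prob_weighted` are taken from the platform camera data.

```r
# Toy stand-in for platform_camera (values invented for illustration)
toy_detections <- data.frame(
  top1 = c("fly_small", "fly_sarco", "none_bg", "bee_apis", "fly_small", "wasp"),
  top1_prob_weighted = c(0.91, 0.38, 0.56, 0.82, 0.50, 0.87)
)

# Hypothetical helper: number of real-taxon detections strictly above a threshold
count_above <- function(data, threshold) {
  real_taxa <- data[!grepl("none_", data$top1), ]  # drop "none_" classes
  sum(real_taxa$top1_prob_weighted > threshold)
}

# Evaluate the same thresholds in one pass
thresholds <- c(0.4, 0.5, 0.85)
retained <- sapply(thresholds, count_above, data = toy_detections)
threshold_table <- data.frame(threshold = thresholds, retained = retained)
print(threshold_table)
```

Because the filter uses a strict `>`, a detection scored exactly 0.50 is dropped at a threshold of 0.5, which matches the behaviour of the filters used above.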

III.D. Automated methods - PLATFORM CAMERAS

III.D.1. Data preparation

str(platform_camera)
## 'data.frame':    2096 obs. of  13 variables:
##  $ Site_Tn           : chr  "WED_3" "WED_3" "WED_3" "WED_3" ...
##  $ location          : chr  "WED" "WED" "WED" "WED" ...
##  $ transect          : int  3 3 3 3 3 3 3 3 3 3 ...
##  $ date              : chr  "2024-07-11" "2024-07-11" "2024-07-11" "2024-07-11" ...
##  $ ID                : chr  "cam36_4" "cam36_1091" "cam36_2044" "cam36_2045" ...
##  $ cam_ID            : chr  "seppi-cam36" "seppi-cam36" "seppi-cam36" "seppi-cam36" ...
##  $ start_time        : chr  "2024-07-11T13:18:12Z" "2024-07-11T13:38:07Z" "2024-07-11T14:39:33Z" "2024-07-11T14:39:54Z" ...
##  $ top1              : chr  "none_bg" "fly_sarco" "fly_small" "fly_small" ...
##  $ det_conf_mean     : num  0.8 0.92 0.65 0.54 0.7 0.69 0.87 0.63 0.88 0.58 ...
##  $ track_ID_imgs     : int  3 20 5 14 28 19 15 5 12 13 ...
##  $ top1_imgs         : int  2 15 5 13 28 19 11 5 12 9 ...
##  $ top1_prob_mean    : num  0.84 0.51 0.91 0.88 0.89 0.87 0.68 0.82 0.88 0.52 ...
##  $ top1_prob_weighted: num  0.56 0.38 0.91 0.82 0.89 0.87 0.5 0.82 0.88 0.36 ...
platform_camera <- platform_camera %>%
  #transect values from numbers 1-5 to T1-T5
  mutate(transect = as.character(transect)) %>%
  mutate(transect = paste0("T", transect))%>%
  #filter out top1 that start with none_ as they are not real taxa
  filter(!grepl("none_", top1))

#format date

platform_camera1 <- platform_camera %>%
  #right date format
  mutate(date = as.Date(date, format = "%Y-%m-%d"))%>%
  #join the envir_data with platform_camera
  left_join(envir_data, by = c("location"="Site","transect" ="Transect", "date"="Date"))%>%
  #remove extra columns
  dplyr::select(-minutes_since_9am)%>%
  
  #add column with recording time per transect from platform_logs_rec
  left_join(platform_logs_rec, by = c("location"="Site", "transect"="Transect", "cam_ID"))%>%
  
    #add minutes_since_9am column by mutating first_record_start_time from 2024-07-11T13:38:07Z to an hour (13) and minute (38) columns
  #mutate(first_record_start_time = gsub("T", " ", first_record_start_time)) %>%  # Replace 'T' with space
  #mutate(first_record_start_time = gsub("Z", "", first_record_start_time)) %>%  # Remove 'Z'
  #remove the seconds from the time
  #mutate(first_record_start_time = gsub(":(\\d{2})$", "", first_record_start_time)) %>%  # Remove seconds
  #remove the "yyyy-mm-dd " from the first_record_start_time
  #mutate(first_record_start_time = gsub(".* ", "", first_record_start_time)) %>%  # Remove date part
  #split the first_record_start_time into hour and minute columns
  mutate(Hour = as.integer(substr(first_record_start_time, 1, 2)),  # Extract hour
         Minute = as.integer(substr(first_record_start_time, 4, 5)))%>% # Extract minutes
  #add new column minutes since 9 am
  mutate(minutes_since_9am = (Hour - 9) * 60 + Minute)%>%
  #keep only first value of minutes_since_9am when site and transect are the same
  group_by(location, transect, date) %>%
  mutate(minutes_since_9am = first(minutes_since_9am)) %>%
  ungroup() %>%
  #remove unnecessary columns
  dplyr::select(-c("first_record_start_time", "Hour", "Minute"))#%>%
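
The time conversion can be checked by hand on a few values. This sketch assumes, as the `substr()` positions imply, that `first_record_start_time` is stored as an "HH:MM" string; the example times are invented.

```r
# Invented start times in "HH:MM" format (assumed format of first_record_start_time)
start_times <- c("09:00", "13:38", "10:05")

hour   <- as.integer(substr(start_times, 1, 2))  # characters 1-2: hour
minute <- as.integer(substr(start_times, 4, 5))  # characters 4-5: minute

# Minutes elapsed since 09:00
minutes_since_9am <- (hour - 9) * 60 + minute
print(minutes_since_9am)
```

A full ISO timestamp such as "2024-07-11T13:38:07Z" would need its date part stripped first, since characters 1-2 of that string are "20" rather than the hour.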

#barplot start time per site, colored by transect
platform_camera1 %>%
    ggplot(aes(x = location, y = minutes_since_9am, fill = transect)) +
  geom_bar(stat = "identity", position= "dodge") +
  labs(title = "Start time of transect walk per site",
       x = "Site",
       y = "Time (minutes since 9 am)") +
  theme(legend.position = "left")+
  #viridis
  scale_fill_viridis_d() 

platform_camera1 <- platform_camera1 %>%
 #z-transform (center and scale) the numeric predictor columns listed in across() below
  mutate(across(c("dm_wind_velocity", "dm_temperature", "agri", "grass", "snh", "forest", "urban", "water", "Pastinaca.sativa", "Daucus.carota", "top2_ratio", "Floral_simpson_index_T", "minutes_since_9am", "Days_since_start"), scale))
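
One detail of this scaling step is that `scale()` returns a one-column matrix carrying centering attributes rather than a plain vector. The sketch below uses invented numbers (`temp` is a hypothetical variable) to show the z-transformation and how to drop the matrix structure if a plain numeric column is preferred.

```r
temp <- c(18, 21, 24, 27, 30)     # invented temperatures

z_matrix <- scale(temp)           # centred and scaled, returned as a 5 x 1 matrix
z_vector <- as.numeric(z_matrix)  # same values as a plain numeric vector

print(is.matrix(z_matrix))        # the scaled result is a matrix
print(is.matrix(z_vector))        # the converted result is not
print(c(mean = mean(z_vector), sd = sd(z_vector)))  # mean near 0, sd near 1
```

Inside `across()`, the same conversion could be written as `~ as.numeric(scale(.x))`; whether that is needed depends on how downstream functions handle matrix columns.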

III.D.2. Data exploration

#barplot of the number of each taxon caught with the platform camera per site
platform_camera1 %>%
  #filter out top1 that start with none_ as they are not real taxa
  filter(!grepl("none_", top1)) %>%
  #remove top1_prob_weighted that are below 0.5
  filter(top1_prob_weighted > 0.5) %>%
  #group by site and taxa
  group_by(location, Site_type, top1) %>%
  #count the number of each top1 caught
  summarise(Count = n()) %>%
  ggplot(aes(x = location, y = Count, fill = top1)) +
  geom_bar(stat = "identity", color = "black") +
  labs(title = "Count of each taxon caught on platforms per site",
       x = "Site",
       y = "Number of individuals caught") +
  #viridis  color palette for the taxa 
  #scale_fill_viridis_d(option = "rocket") +
  scale_fill_manual(values = custom25) +  # Apply manually
  theme(legend.position = "right")
## `summarise()` has grouped output by 'location', 'Site_type'. You can override
## using the `.groups` argument.

#barplot of the number of each taxon caught with the platform camera per site_type
platform_camera1 %>%
  #filter out top1 that start with none_ as they are not real taxa
  filter(!grepl("none_", top1)) %>%
  #remove top1_prob_weighted that are below 0.5
  filter(top1_prob_weighted > 0.5) %>%
  #group by site and taxa
  group_by(location, Site_type, top1) %>%
  #count the number of each top1 caught per site (counts are not divided by the number of sites per site type: 5 reference, 4 young restored)
  
  summarise(Count = n()) %>%
  ggplot(aes(x = Site_type, y = Count, fill = top1)) +
  geom_bar(stat = "identity", color = "black") +
  labs(title = "Count of each taxon caught on platforms per site type",
       x = "Site type",
       y = "Number of individuals caught") +
  #viridis  color palette for the taxa 
  #scale_fill_viridis_d(option = "rocket") +
  scale_fill_manual(values = custom25, name= "Taxa") +  # Apply manually
  theme(legend.position = "right")
## `summarise()` has grouped output by 'location', 'Site_type'. You can override
## using the `.groups` argument.

III.E. Automated methods - FLOWER CAMERAS

III.E.1. Data preparation

#change format of "time" column from hh-mm-ss to hh:mm:ss
flower_camera <- flower_camera %>%
  mutate(time = gsub("\\-", ":", time)) %>%  # Replace '-' with ':'
  mutate(time = chron::times(time))%>%  # Convert to time-only format
  #change format of "date" 
  mutate(date = as.Date(date, format = "%Y-%m-%d"))

str(flower_camera)
## 'data.frame':    56173 obs. of  10 variables:
##  $ Image_Path             : chr  "C:/Users/Almas/Ultralytics/runs/detect/20250225_predict_seppi-cam31/crops/arthropod\\2024-07-11_12-09-30-664486_full.jpg" "C:/Users/Almas/Ultralytics/runs/detect/20250225_predict_seppi-cam31/crops/arthropod\\2024-07-11_12-09-40-664478_full.jpg" "C:/Users/Almas/Ultralytics/runs/detect/20250225_predict_seppi-cam31/crops/arthropod\\2024-07-11_12-10-00-664426_full.jpg" "C:/Users/Almas/Ultralytics/runs/detect/20250225_predict_seppi-cam31/crops/arthropod\\2024-07-11_12-10-20-664420_full.jpg" ...
##  $ Order                  : chr  "Coleoptera" "Lepidoptera" "Lepidoptera" "Hymenoptera" ...
##  $ Family                 : chr  "Syrphidae" "Fabaceae" "Vitaceae" "Apidae" ...
##  $ Family_Confidence      : num  0.2925 0.1594 0.0963 0.9024 0.956 ...
##  $ Classification_Category: chr  "Syrphidae" "other_families" "other_families" "Apidae" ...
##  $ cam                    : chr  "cam31" "cam31" "cam31" "cam31" ...
##  $ date                   : Date, format: "2024-07-11" "2024-07-11" ...
##  $ time                   : 'times' num  12:09:30 12:09:40 12:10:00 12:10:20 12:10:25 ...
##   ..- attr(*, "format")= chr "h:m:s"
##  $ site                   : chr  "WED" "WED" "WED" "WED" ...
##  $ flower_sp              : chr  "Knautia arvensis" "Knautia arvensis" "Knautia arvensis" "Knautia arvensis" ...
custom193 <- paletteer_c("ggthemes::Sunset-Sunrise Diverging", n = 193)

# Replace "other_families" with gray
custom193_named <- setNames(custom193, sort(unique(flower_camera$Classification_Category)))
custom193_named["other_families"] <- "gray50"  # Adjust shade if needed

#plot of the number of each taxon caught in the flower cameras per site
flower_camera %>%
  #summarise per site and count the number of each taxon caught
  group_by(site, Classification_Category) %>%
  summarise(Count = n(), .groups = "drop") %>%
  
  ggplot(aes(x = site, y = Count, fill = Classification_Category)) +
  geom_bar(stat = "identity") +
  labs(title = "Count of each taxon caught with flower cameras per site",
       x = "Site",
       y = "Number of detections") +
  theme(legend.position = "none") +
  scale_fill_manual(values = custom193_named)   # Apply manually

#remove rows with Classification_Category "other_families"
flower_camera <- flower_camera %>%
 #filter out the rows where Classification_Category is "other_families"
  filter(Classification_Category != "other_families")
  
  
## Define THRESHOLD ----
TH <- 0.5

#histogram of the Family_Confidence
flower_camera %>%
  ggplot(aes(x = Family_Confidence)) +
  geom_histogram(binwidth = 0.05, fill = "lightblue", color = "black") +
  labs(title = "Histogram of Family Confidence",
       x = "Family Confidence",
       y = "Count")+
  #add a vertical line at the threshold
  geom_vline(xintercept = TH, color = "red", linetype = "dashed")+  
  # Add confidence threshold annotation
  annotate("text", x = Inf, y = Inf, label = paste("Confidence threshold:", TH), 
           hjust = 1.1, vjust = 1.1, size = 3, color = "red")

#new dataframe above TH
flower_camera_light <- flower_camera %>%
  #remove path column
  dplyr::select(-Image_Path) %>%
  #filter out the rows where Family_Confidence is below the threshold
  filter(Family_Confidence > TH) %>% 
  #fill Site_type column with reference or young_restored
  mutate(Site_type = ifelse(site %in% c("DES", "HLI", "JEP", "STP", "WUP"), "reference", "young_restored"))
  
#calculate the proportion of rows removed
diff <- nrow(flower_camera) - nrow(flower_camera_light)
prop <- diff / nrow(flower_camera) 

#print 
print(paste("By removing rows with a probability score below the threshold", TH, ", we remove", round(prop,2), "of the data."))
## [1] "By removing rows with a probability score below the threshold 0.5 , we remove 0.65 of the data."
#remove 
rm(diff, prop)

#plot of the number of each taxon caught in the flower cameras per site
flower_camera_light %>%
  #summarise per site and count the number of each taxon caught
  group_by(site, Classification_Category) %>%
  summarise(Count = n(), .groups = "drop") %>%
  
  ggplot(aes(x = site, y = Count, fill = Classification_Category)) +
  geom_bar(stat = "identity") +
  labs(title = "Count of each taxon detected by flower cameras per site",
       x = "Site",
       y = "Number of detections") +
  theme(legend.position = "none") +
  scale_fill_manual(values = custom193_named)  +  
  # Add confidence threshold annotation
  annotate("text", x = Inf, y = Inf, label = paste("Probability threshold:", TH), 
           hjust = 1.1, vjust = 1.1, size = 3)

#plot of the number of each taxon detected per flower species 
flower_camera_light %>%
  #summarise per site, taxon and flower species and count the number of detections
  group_by(site, Classification_Category, flower_sp) %>%
  summarise(Count = n(), .groups = "drop") %>%
  
  ggplot(aes(x = flower_sp, y = Count, fill = Classification_Category)) +
  geom_bar(stat = "identity") +
  labs(title = "Count of each taxon caught in flower cameras per flower species",
       x = "Flower Species",
       y = "Number of detections") +
  theme(legend.position = "none") +
  #slant the x-axis labels
  theme(axis.text.x = element_text(angle = 45, hjust = 1)) +
  scale_fill_manual(values = custom193_named) +  
  # Add confidence threshold annotation
  annotate("text", x = Inf, y = Inf, label = paste("Probability threshold:", TH), 
           hjust = 1.1, vjust = 1.1, size = 3)

III.E.2. Data exploration - Individuals vs time

#plot of the number of individuals caught in the flower cameras over time
# Convert time to POSIXct by combining with the date
binned_data <- flower_camera_light %>%
  mutate(
    datetime = as.POSIXct(paste(date, time), format = "%Y-%m-%d %H:%M:%S"),  # Ensure proper datetime format
    time_bin = floor_date(datetime, "10 minutes"))%>%  # Bin into 10-minute chunks
  #remove date from the time_bin column
  mutate(time_bin = format(time_bin, "%H:%M:%S"))

# Summarize count of detections per family per time bin
binned_data <- binned_data %>%
  group_by(date,site,time_bin, Family) %>%
  summarise(Count = n(), .groups = "drop")%>%
  #change time_bin to a character
  mutate(time_bin = as.character(time_bin))

# Plot the data
ggplot(binned_data, aes(x = time_bin, y = Count, fill = Family)) +
  geom_bar(stat = "identity")+
  labs(
    title = "Number of Individuals Per Family Over Time (10-min Bins)",
    x = "Time (10-minute bins)",
    y = "Count of Individuals"
  ) +
  facet_wrap(~site) +
  theme_minimal() +
  theme(legend.position = "none")+
  # format x-axis to show labels every half hour
  scale_x_discrete(
      breaks = binned_data$time_bin[grepl("00:00$|30:00$", binned_data$time_bin)],
      labels = gsub(":00$", "", binned_data$time_bin[grepl("00:00$|30:00$", binned_data$time_bin)]))+
  #slant the x-axis labels
  theme(axis.text.x = element_text(angle = 90, hjust = 1)) 

  #scale_color_manual(values = custom193_named)   # Apply manually

#rm(binned_data)

#linear model of the number of individuals caught in the flower cameras over time

III.E.3. Data exploration - Daucus carota

Here, we focus on Daucus carota, as it is common in the study area and is known to attract a variety of pollinators.

Since the cameras had no ID tracking system, the same individual can be detected many times, so detection counts do not equal numbers of individuals. We therefore look at the taxonomic richness and relative abundance of pollinators attracted to Daucus carota.

daucus_cam <- flower_camera_light %>%
  #keep only the rows where the flower species is Daucus carota
  filter(flower_sp == "Daucus carota") %>%
  group_by(site) %>%
  #reframe() is used because several rows are returned per group; its result is always ungrouped
  reframe(unique_families = unique(Classification_Category)) %>%
  #add Count column filled with 1 
  mutate(Count = 1)
#same thing with netting
daucus_net <- netting %>%
  #keep only the rows where the plant species is Daucus carota
  filter(plant_species == "Daucus carota") %>%
  group_by(site) %>%
  #reframe() is used because several rows are returned per group; its result is always ungrouped
  reframe(unique_families = unique(family)) %>%
  #add Count column filled with 1 
  mutate(Count = 1)
# Combine both datasets, adding a column to differentiate methods
daucus <- bind_rows(
  daucus_cam %>% mutate(method = "Camera"),
  daucus_net %>% mutate(method = "Net")
)

#combining all the unique families caught on Daucus with nets and cameras
fams <- sort(unique(daucus$unique_families));length(fams) # Get unique families sorted alphabetically
## [1] 90
customfam <- paletteer_c("pals::kovesi.rainbow_bgyrm_35_85_c69", n = length(fams))
customfam <- setNames(customfam, fams)

# Create a single ggplot with faceting
final_plot1 <- ggplot(daucus, aes(x = site, y = Count, fill = unique_families)) +
  geom_bar(stat = "identity") +
  facet_wrap(~method, ncol = 1) +  # Separate plots
  labs(title = "Pollinator Family richness caught on Daucus carota",
       x = "Site",
       y = "Number of unique families caught") +
  theme(legend.position = "none") +  # Legend hidden
  scale_fill_manual(values = customfam)  # Consistent colors

# Create a second plot with one bar per method (all sites stacked)
final_plot2 <- ggplot(daucus, aes(x = method, y = Count, fill = unique_families)) +
  geom_bar(stat = "identity") +
  labs(title = "Pollinator Family richness caught on Daucus carota",
       x = "Method",
       y = "Number of unique families caught") +
  theme(legend.position = "none") +  # Legend hidden
  scale_fill_manual(values = customfam)  # Consistent colors

# Print both plots
print(final_plot1); print(final_plot2)

rm(daucus_cam, daucus_net, fams, customfam, final_plot1, final_plot2)
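
Richness alone ignores how detections are distributed among families. As a complement, relative abundance per site can be derived from detection counts. This is a minimal base R sketch on invented data: `toy_cam` is hypothetical, and only the column names `site` and `Classification_Category` mirror `flower_camera_light`.

```r
# Invented detections on Daucus carota (not real data)
toy_cam <- data.frame(
  site = c("WED", "WED", "WED", "WED", "DES", "DES"),
  Classification_Category = c("Syrphidae", "Syrphidae", "Apidae", "Muscidae",
                              "Apidae", "Apidae")
)

# Detections per site (rows) and family (columns)
counts <- table(toy_cam$site, toy_cam$Classification_Category)

# Family richness: number of families with at least one detection per site
richness <- rowSums(counts > 0)

# Relative abundance: each family's share of the detections within a site
rel_abundance <- prop.table(counts, margin = 1)

print(richness)
print(round(rel_abundance, 2))
```

Without ID tracking, these shares describe detection frequency (for example, visitation activity) rather than the proportion of individuals, so they should be interpreted with that caveat.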

IV. Modelling


IV.A. Flower survey data

The flower survey data were recorded at the plot level. We average them to the transect level so that they match the resolution of the netting, pan trap and platform data.
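
The aggregation from plot to transect level can be sketched as follows. The data and the column name `Plot_Cover` are invented for illustration, and base R `aggregate()` is used here only to keep the example free of package dependencies.

```r
# Invented plot-level survey values (not real data)
toy_plots <- data.frame(
  Site       = c("DES", "DES", "DES", "DES", "WED", "WED"),
  Transect   = c("T1", "T1", "T2", "T2", "T1", "T1"),
  Plot_Cover = c(10, 20, 30, 50, 5, 15)
)

# Average plot values within each Site x Transect combination
toy_transects <- aggregate(Plot_Cover ~ Site + Transect, data = toy_plots, FUN = mean)
print(toy_transects)
```

Each row of `toy_transects` then corresponds to one transect, which is the unit shared with the other sampling methods.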

IV.A.1. Flower survey - Linear mixed model with random effect - lmer

scaled_envir_data <-envir_data %>%
  
  #z transform the numerical columns, except Floral_simpson_index
  mutate(across(where(is.numeric) & !contains("Floral_simpson_index"), scale))

#checking the values
scaled_envir_data%>%
  mutate(across(where(is.numeric)))%>%
  #summary()%>%
  sapply(sd)
## Warning in var(if (is.vector(x) || is.factor(x)) x else as.double(x), na.rm =
## na.rm): NAs introduced by coercion
## Warning in var(if (is.vector(x) || is.factor(x)) x else as.double(x), na.rm =
## na.rm): NAs introduced by coercion
## Warning in var(if (is.vector(x) || is.factor(x)) x else as.double(x), na.rm =
## na.rm): NAs introduced by coercion
## Warning in var(if (is.vector(x) || is.factor(x)) x else as.double(x), na.rm =
## na.rm): NAs introduced by coercion
##                   Date                   Site               Transect 
##              9.4641802                     NA                     NA 
##      minutes_since_9am       dm_wind_velocity         dm_temperature 
##              1.0000000              1.0000000              1.0000000 
##                   agri                  grass                    snh 
##              1.0000000              1.0000000              1.0000000 
##                 forest                  urban                  water 
##              1.0000000              1.0000000              1.0000000 
##         majority_class       Pastinaca.sativa          Daucus.carota 
##                     NA              1.0000000              1.0000000 
##             top2_ratio   average_flower_cover           Plot_Cover_T 
##              1.0000000              1.0000000              1.0000000 
##              Site_type Floral_simpson_index_T       Days_since_start 
##                     NA              0.1130745              1.0000000
scaled_envir_data%>%
  ggplot(aes(x=Floral_simpson_index_T))+
  geom_histogram(binwidth = 0.1, fill = "lightblue", color = "black")+
  labs(title = "Histogram of Floral Simpson Index",
       x = "Floral Simpson Index",
       y = "Count")

# is the floral simpson index normally distributed?
shapiro.test(scaled_envir_data$Floral_simpson_index_T) # p-value = 0.02828, so it is not normally distributed
## 
##  Shapiro-Wilk normality test
## 
## data:  scaled_envir_data$Floral_simpson_index_T
## W = 0.94322, p-value = 0.02828
datawizard::describe_distribution(scaled_envir_data$Floral_simpson_index_T)
## Mean |   SD |  IQR |        Range | Skewness | Kurtosis |  n | n_Missing
## ------------------------------------------------------------------------
## 0.19 | 0.11 | 0.14 | [0.04, 0.48] |     0.74 |     0.10 | 45 |         0
# Create a linear model
# full model with Floral_simpson_index_T as response variable; site type, surrounding land-cover class and plot cover as explanatory variables; and site as random effect
plantsimp_mod1_full <- lmer(Floral_simpson_index_T
                            ~ Site_type
                            + majority_class 
                            + Plot_Cover_T
                            #+ agri + grass + snh + forest + urban + water
                            + (1|Site), data = scaled_envir_data)
## boundary (singular) fit: see help('isSingular')
summary(plantsimp_mod1_full)
## Linear mixed model fit by REML ['lmerMod']
## Formula: Floral_simpson_index_T ~ Site_type + majority_class + Plot_Cover_T +  
##     (1 | Site)
##    Data: scaled_envir_data
## 
## REML criterion at convergence: -67.6
## 
## Scaled residuals: 
##     Min      1Q  Median      3Q     Max 
## -2.3955 -0.5241 -0.1653  0.4644  2.4832 
## 
## Random effects:
##  Groups   Name        Variance Std.Dev.
##  Site     (Intercept) 0.000000 0.00000 
##  Residual             0.007099 0.08425 
## Number of obs: 45, groups:  Site, 9
## 
## Fixed effects:
##                         Estimate Std. Error t value
## (Intercept)              0.21712    0.03367   6.448
## Site_typeyoung_restored -0.04386    0.03604  -1.217
## majority_classforest     0.01206    0.04489   0.269
## majority_classgrass     -0.01266    0.03444  -0.368
## majority_classurban     -0.01004    0.04573  -0.219
## Plot_Cover_T            -0.07821    0.01468  -5.329
## 
## Correlation of Fixed Effects:
##             (Intr) St_ty_ mjrty_clssf mjrty_clssg mjrty_clssr
## St_typyng_r -0.758                                           
## mjrty_clssf -0.798  0.618                                    
## mjrty_clssg -0.742  0.422  0.594                             
## mjrty_clssr -0.074 -0.299  0.033       0.163                 
## Plot_Covr_T -0.406  0.425  0.422       0.320      -0.196     
## optimizer (nloptwrap) convergence code: 0 (OK)
## boundary (singular) fit: see help('isSingular')
parameters(plantsimp_mod1_full)
## Cannot compute standard errors and confidence intervals for random
##   effects parameters.
##   Your model may suffer from singularity (see see `?lme4::isSingular` and
##   `?performance::check_singularity`).
##   You may try to impose a prior on the random effects parameters, e.g.
##   using the glmmTMB package.
## # Fixed Effects
## 
## Parameter                  | Coefficient |   SE |         95% CI | t(37) |      p
## ---------------------------------------------------------------------------------
## (Intercept)                |        0.22 | 0.03 | [ 0.15,  0.29] |  6.45 | < .001
## Site type [young_restored] |       -0.04 | 0.04 | [-0.12,  0.03] | -1.22 | 0.231 
## majority class [forest]    |        0.01 | 0.04 | [-0.08,  0.10] |  0.27 | 0.790 
## majority class [grass]     |       -0.01 | 0.03 | [-0.08,  0.06] | -0.37 | 0.715 
## majority class [urban]     |       -0.01 | 0.05 | [-0.10,  0.08] | -0.22 | 0.828 
## Plot Cover T               |       -0.08 | 0.01 | [-0.11, -0.05] | -5.33 | < .001
## 
## # Random Effects
## 
## Parameter            | Coefficient
## ----------------------------------
## SD (Intercept: Site) |        0.00
## SD (Residual)        |        0.08
## 
## Uncertainty intervals (equal-tailed) and p-values (two-tailed) computed
##   using a Wald t-distribution approximation. Uncertainty intervals for
##   random effect variances computed using a Wald z-distribution
##   approximation.
#checking the model
check_model(plantsimp_mod1_full)
## Some of the variables were in matrix-format - probably you used
##   `scale()` on your data?
##   If so, and you get an error, please try `datawizard::standardize()` to
##   standardize your data.
## Some of the variables were in matrix-format - probably you used
##   `scale()` on your data?
##   If so, and you get an error, please try `datawizard::standardize()` to
##   standardize your data.

#overdispersion
check_overdispersion(plantsimp_mod1_full)
## # Overdispersion test
## 
##  dispersion ratio = 0.882
##           p-value =   0.6
## No overdispersion detected.
#collinearity
check_collinearity(plantsimp_mod1_full)
## # Check for Multicollinearity
## 
## Low Correlation
## 
##            Term  VIF   VIF 95% CI Increased SE Tolerance Tolerance 95% CI
##       Site_type 2.03 [1.51, 3.10]         1.43      0.49     [0.32, 0.66]
##  majority_class 2.19 [1.60, 3.34]         1.48      0.46     [0.30, 0.62]
##    Plot_Cover_T 1.34 [1.10, 2.10]         1.16      0.75     [0.48, 0.91]
# Compute fitted values and Pearson residuals
plantsimp_mod1_vals <- fitted(plantsimp_mod1_full)
plantsimp_mod1_residuals <- residuals(plantsimp_mod1_full, type = "pearson")
# Create binned residuals plot
arm::binnedplot(plantsimp_mod1_vals, plantsimp_mod1_residuals)

#DHARMa package - simulate residuals and check model assumptions
plantsimp_mod1_sim_res <- simulateResiduals(fittedModel = plantsimp_mod1_full)
plot(plantsimp_mod1_sim_res)
## qu = 0.25, log(sigma) = -2.950879 : outer Newton did not converge fully.

testDispersion(plantsimp_mod1_full) 

## 
##  DHARMa nonparametric dispersion test via sd of residuals fitted vs.
##  simulated
## 
## data:  simulationOutput
## dispersion = 0.88224, p-value = 0.6
## alternative hypothesis: two.sided
plantsimp_mod1_beta <- glmmTMB(Floral_simpson_index_T 
                               ~ Site_type 
                               + majority_class #majority land cover class in 1 km buffer zone around the site
                               #+ agri + grass + snh + forest + urban + water
                               + Plot_Cover_T 
                               + top2_ratio
                               + (1|Site),
                    #ziformula = ~1,  # allows zero-inflation if needed
                    family = beta_family(),
                    data = scaled_envir_data)

summary(plantsimp_mod1_beta)
##  Family: beta  ( logit )
## Formula:          
## Floral_simpson_index_T ~ Site_type + majority_class + Plot_Cover_T +  
##     top2_ratio + (1 | Site)
## Data: scaled_envir_data
## 
##      AIC      BIC   logLik deviance df.resid 
##   -100.7    -84.4     59.3   -118.7       36 
## 
## Random effects:
## 
## Conditional model:
##  Groups Name        Variance Std.Dev.
##  Site   (Intercept) 7.19e-11 8.48e-06
## Number of obs: 45, groups:  Site, 9
## 
## Dispersion parameter for beta family (): 30.7 
## 
## Conditional model:
##                         Estimate Std. Error z value Pr(>|z|)    
## (Intercept)             -1.46062    0.18507  -7.892 2.97e-15 ***
## Site_typeyoung_restored -0.31033    0.19319  -1.606    0.108    
## majority_classforest     0.12716    0.23260   0.547    0.585    
## majority_classgrass      0.05842    0.18633   0.314    0.754    
## majority_classurban      0.22105    0.28849   0.766    0.444    
## Plot_Cover_T            -0.63254    0.11047  -5.726 1.03e-08 ***
## top2_ratio              -0.08312    0.08148  -1.020    0.308    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
parameters(plantsimp_mod1_beta)
## Your model may suffer from singularity (see `?lme4::isSingular` and
##   `?performance::check_singularity`).
##   Some of the confidence intervals of the random effects parameters are
##   probably not meaningful!
##   You may try to impose a prior on the random effects parameters, e.g.
##   using the glmmTMB package.
## # Fixed Effects
## 
## Parameter                  | Coefficient |   SE |         95% CI |     z |      p
## ---------------------------------------------------------------------------------
## (Intercept)                |       -1.46 | 0.19 | [-1.82, -1.10] | -7.89 | < .001
## Site type [young_restored] |       -0.31 | 0.19 | [-0.69,  0.07] | -1.61 | 0.108 
## majority class [forest]    |        0.13 | 0.23 | [-0.33,  0.58] |  0.55 | 0.585 
## majority class [grass]     |        0.06 | 0.19 | [-0.31,  0.42] |  0.31 | 0.754 
## majority class [urban]     |        0.22 | 0.29 | [-0.34,  0.79] |  0.77 | 0.444 
## Plot Cover T               |       -0.63 | 0.11 | [-0.85, -0.42] | -5.73 | < .001
## top2 ratio                 |       -0.08 | 0.08 | [-0.24,  0.08] | -1.02 | 0.308 
## 
## # Dispersion
## 
## Parameter   | Coefficient |         95% CI
## ------------------------------------------
## (Intercept) |       30.71 | [20.31, 46.45]
## 
## # Random Effects Variances
## 
## Parameter            | Coefficient |      95% CI
## ------------------------------------------------
## SD (Intercept: Site) |    8.48e-06 | [0.00, Inf]
## 
## Uncertainty intervals (equal-tailed) and p-values (two-tailed) computed
##   using a Wald z-distribution approximation.
## 
## The model has a log- or logit-link. Consider using `exponentiate =
##   TRUE` to interpret coefficients as ratios.
#checking the model
check_model(plantsimp_mod1_beta)
## `check_outliers()` does not yet support models of class `glmmTMB`.

#overdispersion
check_overdispersion(plantsimp_mod1_beta)
## # Overdispersion test
## 
##  dispersion ratio = 1.053
##           p-value = 0.792
## No overdispersion detected.
#collinearity
check_collinearity(plantsimp_mod1_beta)
## # Check for Multicollinearity
## 
## Low Correlation
## 
##            Term  VIF   VIF 95% CI Increased SE Tolerance Tolerance 95% CI
##       Site_type 2.06 [1.55, 3.07]         1.44      0.48     [0.33, 0.65]
##  majority_class 2.48 [1.81, 3.71]         1.57      0.40     [0.27, 0.55]
##    Plot_Cover_T 1.24 [1.06, 1.97]         1.12      0.80     [0.51, 0.94]
##      top2_ratio 1.14 [1.02, 2.11]         1.07      0.88     [0.47, 0.98]
# Compute fitted values and Pearson residuals
plantsimp_mod1_beta_vals <- fitted(plantsimp_mod1_beta)
plantsimp_mod1_beta_residuals <- residuals(plantsimp_mod1_beta, type = "pearson")
# Create binned residuals plot
arm::binnedplot(plantsimp_mod1_beta_vals, plantsimp_mod1_beta_residuals)

#DHARMa package - simulate residuals and check model assumptions
plantsimp_mod1_beta_sim_res <- simulateResiduals(fittedModel = plantsimp_mod1_beta)
plot(plantsimp_mod1_beta_sim_res)

IV.A.1.a. Visualize the model

# Earlier draft of the estimates plot, superseded by the labelled version that follows.
# Every line is commented out so that the trailing theme()/labs() calls are not evaluated on their own.
#plot_model(plantsimp_mod1_beta, type = "est", show.values = TRUE, value.offset = .3, title = "Flower survey: Floral Simpson Index") +
#  theme(axis.text.x = element_text(angle = 45, hjust = 1)) +
#  labs(
#    title = "Flower survey: Floral Simpson Index",
#    x = "Predictor",
#    y = "Estimate")
(flower_survey_est <- plot_model(plantsimp_mod1_beta, 
           type = "est", 
           show.values = TRUE, 
           value.offset = 0.3,
           #sort.est = TRUE,
           axis.labels = c("Top 2 Flower Ratio",
                           "Flower Cover % per transect",  
                           "Urban (1km Buffer)",  
                           "Grassland (1km Buffer)",  
                           "Forest (1km Buffer)", 
                           "Young Restored Site")) +
    labs(title = "Floral Simpson Index", x = "Predictors",y = "Estimate") + 
    theme(axis.text.y = element_text(hjust = 0)))  # 0 = left, 1 = right

# Effect of flower cover: predictions from plantsimp_mod1_beta ---------
# Get the original mean and SD of Plot_Cover_T before scaling
plot_cover_mean <- mean(envir_data$Plot_Cover_T, na.rm = TRUE)
plot_cover_sd <- sd(envir_data$Plot_Cover_T, na.rm = TRUE)
# Get predictions on the scaled variable
pred_plot_cover <- ggpredict(plantsimp_mod1_beta , terms = "Plot_Cover_T")
## You are calculating adjusted predictions on the population-level (i.e.
##   `type = "fixed"`) for a *generalized* linear mixed model.
##   This may produce biased estimates due to Jensen's inequality. Consider
##   setting `bias_correction = TRUE` to correct for this bias.
##   See also the documentation of the `bias_correction` argument.
# Unscale the x-axis
pred_plot_cover$x_unscaled <- (pred_plot_cover$x * plot_cover_sd) + plot_cover_mean
# Plot

(coverperc <- ggplot(pred_plot_cover, aes(x = x_unscaled, y = predicted)) +
  geom_line(linewidth = 1.2, color = predictor_colors[["Plot_Cover_T"]]) +  # linewidth replaces size for lines in ggplot2 >= 3.4.0
  geom_ribbon(aes(ymin = conf.low, ymax = conf.high), 
              fill = alpha(predictor_colors[["Plot_Cover_T"]], 0.5)) +
  labs(
    title = "Predicted Floral Simpson Index vs Flower Cover %",
    x = "Flower Cover per transect (%)",
    y = "Predicted Floral Simpson Index"
  ))

#cowplot the two plots
cowplot::plot_grid(flower_survey_est, coverperc, rel_widths = c(1.2, 1), ncol = 2, labels = c("A", "B"), label_size = 12) +
  theme(plot.title = element_text(hjust = 0.5))

ggsave("C:/Users/Almas/Desktop/UNI_LEIPSI/Thesis/Thesis_Rproject/figures/flower_survey_estimates.png",
       width = 10, height = 6, dpi = 600)

rm(coverperc, flower_survey_est)
# Make predictions at -1 SD, mean (0), and +1 SD of flower cover
new_data <- data.frame(
  Site_type = "reference",  # baseline level of Site_type
  majority_class = "grass",  # land cover class held fixed for all predictions
  Plot_Cover_T = c(-1, 0, 1),  # -1 SD, mean, +1 SD
  top2_ratio = 0,  # set other numeric predictors at their mean (which is 0 if scaled)
  Site = NA  # random effect NA for population-level prediction
)

# Predict (on response scale, i.e., Floral Simpson Index)
pred <- predict(plantsimp_mod1_beta, 
                newdata = new_data, 
                type = "response", 
                se.fit = TRUE)

# Combine predictions into a table
results <- new_data %>%
  mutate(
    Predicted_Diversity = pred$fit,
    SE = pred$se.fit,
    Lower_CI = pred$fit - 1.96 * pred$se.fit,
    Upper_CI = pred$fit + 1.96 * pred$se.fit
  )

print(results)
##   Site_type majority_class Plot_Cover_T top2_ratio Site Predicted_Diversity
## 1 reference          grass           -1          0   NA           0.3165516
## 2 reference          grass            0          0   NA           0.1974670
## 3 reference          grass            1          0   NA           0.1156036
##           SE   Lower_CI  Upper_CI
## 1 0.03340236 0.25108302 0.3820203
## 2 0.02058080 0.15712862 0.2378053
## 3 0.01893581 0.07848939 0.1527178
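
The intervals above are symmetric Wald intervals built on the response scale, which can in principle extend beyond the 0 to 1 range of a beta response. The block below is a minimal sketch of an alternative: build the interval on the logit scale and back-transform it with `plogis()`. It assumes that the response-scale standard errors relate to the link-scale ones through the delta method; that is an assumption made for this illustration, not something stated in the output. The `fit` and `se` values are copied from the printed table, and all object names are illustrative.

```r
# Fitted values and response-scale SEs copied from the printed `results` table
fit <- c(0.3165516, 0.1974670, 0.1156036)
se  <- c(0.03340236, 0.02058080, 0.01893581)

# Delta method (assumption): se_response = se_link * mu * (1 - mu)
eta     <- qlogis(fit)               # predictions on the logit scale
se_link <- se / (fit * (1 - fit))    # approximate link-scale SEs

# Wald interval on the logit scale, back-transformed to the response scale
ci_link <- data.frame(
  Plot_Cover_T = c(-1, 0, 1),
  Predicted    = fit,
  Lower_CI     = plogis(eta - 1.96 * se_link),
  Upper_CI     = plogis(eta + 1.96 * se_link)
)
print(ci_link)
```

With the fitted model at hand, the same kind of interval can be obtained without the approximation by calling `predict()` with `type = "link"` and `se.fit = TRUE`, then applying `plogis()` to the bounds.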

IV.A.1.b. Flower Survey Model Interpretation of Results

  • Model Type: Generalized linear mixed-effects model (glmmTMB)
    • Family: Beta distribution with a logit link (suitable for proportions between 0 and 1)
    • Random Effect: Site (included as random intercept)
  • Model’s Assumptions:
    • Distribution: The response variable is bounded between 0 and 1 and deviates from normality (Shapiro-Wilk p = 0.028), so a beta distribution is appropriate.
    • No overdispersion: Dispersion ratio close to 1 (1.053, p = 0.792).
    • No multicollinearity: All predictors have low variance inflation factors (all VIF below 2.5).
    • Singular fit warning: The model reported a potential singular fit because the random effect variance for Site was extremely small (SD ≈ 8.5e-06). Random intercepts therefore contributed almost nothing once the fixed effects were accounted for.
    • Residual diagnostics: No significant issues detected (e.g., no zero-inflation, no major residual structure, no significant outliers).
  • Response Variable:
    • Floral_simpson_index_T — Floral Simpson diversity index at a transect level.
  • Fixed Effects Included:
    • Site_type — young restored vs reference
    • majority_class — Dominant land cover type in 1 km buffer at site level (e.g., forest, grassland, urban)
    • Plot_Cover_T — Total flower cover in % per transect (scaled)
    • top2_ratio — Relative dominance ratio of the two most abundant species at a transect level (scaled)
  • Results:
    • Total flower cover (Plot_Cover_T) had a strong, significant negative effect on floral diversity
      • Estimate = -0.633 (logit scale), p < 0.001
    • Other predictors (site type, land cover class, species dominance) had non-significant effects
    • Random effect (Site) showed negligible variance, suggesting minimal site-to-site variation after accounting for fixed effects
  • Interpretation:
    • Higher flower cover percentage was associated with lower floral diversity — possibly due to competitive exclusion or dominance of a few species.
    • Local vegetation structure (cover) had more influence than broader landscape context (site type or surrounding land cover)
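
Because the model uses a logit link, the coefficients are not changes in the Simpson index itself. The following sketch shows how the printed estimates map back to the response scale. The coefficient values are copied from `summary(plantsimp_mod1_beta)`; the object names are illustrative, and the scenario (reference site, grass-dominated buffer, `top2_ratio` at its mean) mirrors the `new_data` grid used for the predictions.

```r
# Logit-scale coefficients copied from summary(plantsimp_mod1_beta)
b_intercept <- -1.46062
b_grass     <-  0.05842   # majority_class = "grass"
b_cover     <- -0.63254   # Plot_Cover_T (scaled)

# Linear predictor for a reference site in a grass-dominated landscape,
# with top2_ratio at its mean (0) and flower cover at -1 SD, mean, +1 SD
cover_sd <- c(-1, 0, 1)
eta <- b_intercept + b_grass + b_cover * cover_sd

# Back-transform to the response scale (predicted Floral Simpson Index)
predicted_simpson <- plogis(eta)

# Multiplicative change in the odds D / (1 - D) per +1 SD of flower cover
odds_ratio_cover <- exp(b_cover)

print(round(predicted_simpson, 3))
print(round(odds_ratio_cover, 2))
```

The back-transformed values should agree with the `Predicted_Diversity` column printed earlier, up to rounding of the coefficients.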

IV.B. Netting data

For the netting data, the predictor variables are:

  • Floral Simpson Index
    • The Simpson index captures dominant flowering species, aligning with the nested structure of plant–pollinator networks: many specialists rely on a few generalists. This skewed interaction pattern makes generalist flowers key to maintaining pollinator communities, justifying their emphasis in diversity modeling.
  • Minutes since 9 am
    • Time elapsed since 9 am, to account for the time of day. Each netting session consisted of 30 minutes of active netting, so the sampling effort is the same for all transects.
  • Top 2 ratio
    • Relative abundance ratio of D. carota and P. sativa combined, used as a proxy for dominance shifts.
  • Site type
    • Young restored sites (restored 1 to 5 years ago) vs reference sites.
  • Daily mean wind velocity
    • Daily mean wind velocity (m/s) measured at the weather station nearest to the site, to account for weather conditions.
  • Daily mean temperature
    • Daily mean temperature (°C) measured at the weather station nearest to the site, to account for weather conditions.
  • Days since start
    • Days since the start of sampling, to account for seasonal effects.

Majority class and the land cover types (agri, grass, snh, forest, urban, water) were not included in the model. With only 5 transects per site (5 × 9 = 45 observations), the number of predictors has to be kept small. These variables were left out because they had no significant effect on the Floral Simpson Index and because the land cover data are coarse (100 m resolution), so they are unlikely to show a meaningful effect at the scale of our sites. For example, the crop type grown in agricultural areas is unknown, although it could strongly affect floral diversity.
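
The sample-size constraint can be made concrete with a short calculation. The sketch below counts the parameters of the full negative binomial model fitted in IV.B.2 (eight slopes, an intercept, one random-intercept variance and one dispersion parameter) and compares them with the 45 transects. The object names are illustrative; the resulting residual degrees of freedom can be checked against the `df.resid` value in that model's summary.

```r
n_sites     <- 9
n_transects <- 5
n_obs       <- n_sites * n_transects   # 45 transects in total

# Full negative binomial model: 8 slopes + intercept,
# plus one random-intercept variance and one dispersion parameter (theta)
n_fixed  <- 8 + 1
n_params <- n_fixed + 1 + 1

df_resid      <- n_obs - n_params      # residual degrees of freedom
obs_per_param <- n_obs / n_params      # observations available per estimated parameter

print(c(n_obs = n_obs, n_params = n_params,
        df_resid = df_resid, obs_per_param = round(obs_per_param, 1)))
```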

IV.B.1. NETTING Interaction counts - Poisson

#re-scale envir_data, this time including Floral_simpson_index_T
scaled_envir_data <- envir_data %>%
  #z-transform all numeric columns (as written, this also scales Floral_simpson_index_T)
  mutate(across(where(is.numeric), scale))

netting1 <- netting %>%
  #right date format
  mutate(date = as.Date(date, format = "%Y-%m-%d"))%>%
  #summarize by site, transect
  group_by(site, transect) %>%
  summarise(total_interaction_T = sum(total_interaction), .groups = "drop")%>%
   #join the envir_data with netting
  left_join(scaled_envir_data, by = c("site"="Site", "transect"="Transect"))

#histogram of total_interaction_T
netting1 %>%
  ggplot(aes(x = total_interaction_T)) +
  geom_histogram(fill = "lightblue", color = "black") +
  labs(title = "Histogram of Transect Walk Counts",
       x = "total interactions",
       y = "Count")
## `stat_bin()` using `bins = 30`. Pick better value with `binwidth`.
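
The "matrix-format" messages that `check_model()` prints for these models come from the standardisation step: `scale()` returns a one-column matrix rather than a plain vector. The toy example below (made-up values, illustrative column names) shows the difference and one way to keep ordinary numeric columns by wrapping the call in `as.numeric()`. The message itself suggests `datawizard::standardize()` as another option.

```r
# Toy data standing in for one numeric column of envir_data (values are made up)
toy <- data.frame(cover = c(10, 20, 30, 40))

# scale() returns a one-column matrix carrying centring and scaling attributes
toy$cover_matrix <- scale(toy$cover)

# as.numeric() drops the matrix structure and keeps the plain z-scores
toy$cover_vector <- as.numeric(scale(toy$cover))

print(is.matrix(toy$cover_matrix))
print(is.matrix(toy$cover_vector))
print(toy$cover_vector)
```

Both columns hold the same z-scores; only the storage format differs.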

The first model has too many predictors and overfits the data. Many of the predictors are significant, but the model assumptions are not met: there is overdispersion (p < 0.001), the QQ plot is S-shaped, and the residuals appear heteroskedastic in the residuals vs predicted plot. We therefore remove predictors one by one and check the model assumptions after each step.

The model comparison shows that the third model (netting_mod3_full) has the lowest AICc and BIC. AICc and BIC only rank candidate models against each other, however, so a low value does not show that the model fits adequately, and the diagnostics still indicate overdispersion. In the next step, we therefore fit a negative binomial model to see whether it improves the fit.
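
For reference, AICc adds a small-sample penalty to AIC and is interpreted through differences between models fitted to the same data. The sketch below applies the standard formula. Since the Poisson AIC values are not printed in this section, it uses the AIC values and parameter counts from the two negative binomial summaries shown in IV.B.2 as a worked example. The helper name `aicc` is illustrative.

```r
# Small-sample corrected AIC: AICc = AIC + 2k(k + 1) / (n - k - 1)
aicc <- function(aic, k, n) aic + (2 * k * (k + 1)) / (n - k - 1)

n_obs <- 45  # transects

# AIC values and parameter counts taken from the negative binomial summaries in IV.B.2
# (k = fixed effects + random-intercept variance + dispersion parameter)
aicc_mod1 <- aicc(aic = 395.0, k = 11, n = n_obs)
aicc_mod2 <- aicc(aic = 394.1, k = 10, n = n_obs)

delta_aicc <- aicc_mod1 - aicc_mod2   # positive values favour netting_mod2_NB
print(c(mod1 = aicc_mod1, mod2 = aicc_mod2, delta = delta_aicc))
```

With only 45 observations the correction term is not negligible, which is why AICc rather than AIC is the more cautious choice here.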

IV.B.1.a. Interpretation of results for POISSON GLMM

Model: netting_mod3_full

netting_mod3_full <- glmer(total_interaction_T ~ Floral_simpson_index_T + minutes_since_9am + top2_ratio + Site_type + dm_wind_velocity + (1|site), data = netting1, family = "poisson")

Model summary:

  • 5 predictors and one random effect (site)
  • Significant effects of the Floral Simpson index (p < 0.001), time of day (minutes_since_9am, p < 0.001), top2_ratio (p < 0.001) and dm_wind_velocity (p = 0.014)
  • Marginal effect of site type (p = 0.051)

Model assumptions:

  • Overdispersion was detected (p < 0.001) with check_overdispersion() from the performance package
    • A different error distribution, such as the negative binomial, could be used
    • The DHARMa QQ plot showed the same pattern: the dispersion test was significant (p < 0.001)
    • The DHARMa QQ plot also indicated significant outliers
  • Correlation between predictors is low (VIF < 2) according to check_collinearity() from the performance package
  • The binned residuals plot from the arm package shows no clear trend
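
To illustrate what an overdispersion check measures, the sketch below computes a Pearson dispersion ratio by hand on simulated counts. This is the same idea as the packaged checks, but not necessarily the same computation that `check_overdispersion()` or DHARMa perform for mixed models. The data are simulated, a plain `glm()` without random effects is used for simplicity, and all names are illustrative.

```r
set.seed(42)  # simulated data, not the netting counts

n  <- 200
x  <- rnorm(n)
mu <- exp(3 + 0.3 * x)

y_pois <- rpois(n, lambda = mu)           # variance equals the mean
y_nb   <- rnbinom(n, mu = mu, size = 2)   # variance = mu + mu^2 / 2

# Pearson dispersion ratio: sum of squared Pearson residuals / residual df
dispersion_ratio <- function(model) {
  sum(residuals(model, type = "pearson")^2) / df.residual(model)
}

fit_pois <- glm(y_pois ~ x, family = poisson)
fit_nb   <- glm(y_nb ~ x, family = poisson)   # Poisson fitted to overdispersed counts

ratio_pois <- dispersion_ratio(fit_pois)
ratio_nb   <- dispersion_ratio(fit_nb)
print(c(poisson_data = ratio_pois, overdispersed_data = ratio_nb))
```

A ratio near 1 is what a Poisson model expects; a ratio well above 1 signals more variation than the Poisson distribution allows, which is the motivation for the negative binomial model.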

IV.B.2. NETTING - Interaction counts NB

We start again with the full model, but this time we use a negative binomial distribution.

# full model with interaction counts as response variable and environmental, weather and plant diversity variables as explanatory variables, and site as random effect
# negative binomial distribution

netting_mod1_NB <- glmer.nb(total_interaction_T 
                            ~Floral_simpson_index_T 
                            + minutes_since_9am
                            + top2_ratio
                            + Site_type
                            + dm_wind_velocity
                            + dm_temperature
                            + Plot_Cover_T
                            #+  majority_class
                            #+ urban + agri + grass + snh + forest + water
                            + Days_since_start
                            + (1|site), 
                            data = netting1)  # glmer.nb() fits the negative binomial itself and takes no family argument
## boundary (singular) fit: see help('isSingular')
summary(netting_mod1_NB)
## Generalized linear mixed model fit by maximum likelihood (Laplace
##   Approximation) [glmerMod]
##  Family: Negative Binomial(11.8662)  ( log )
## Formula: total_interaction_T ~ Floral_simpson_index_T + minutes_since_9am +  
##     top2_ratio + Site_type + dm_wind_velocity + dm_temperature +  
##     Plot_Cover_T + Days_since_start + (1 | site)
##    Data: netting1
## 
##      AIC      BIC   logLik deviance df.resid 
##    395.0    414.8   -186.5    373.0       34 
## 
## Scaled residuals: 
##      Min       1Q   Median       3Q      Max 
## -1.59318 -0.73377  0.00215  0.62483  2.66430 
## 
## Random effects:
##  Groups Name        Variance Std.Dev. 
##  site   (Intercept) 3.27e-11 5.718e-06
## Number of obs: 45, groups:  site, 9
## 
## Fixed effects:
##                         Estimate Std. Error z value Pr(>|z|)    
## (Intercept)              4.09649    0.07999  51.214  < 2e-16 ***
## Floral_simpson_index_T  -0.30671    0.07666  -4.001 6.31e-05 ***
## minutes_since_9am        0.06231    0.05737   1.086 0.277447    
## top2_ratio               0.24076    0.05091   4.729 2.26e-06 ***
## Site_typeyoung_restored -0.48107    0.14612  -3.292 0.000994 ***
## dm_wind_velocity        -0.13013    0.07808  -1.667 0.095597 .  
## dm_temperature           0.09035    0.08889   1.016 0.309433    
## Plot_Cover_T            -0.21863    0.08288  -2.638 0.008343 ** 
## Days_since_start         0.04848    0.06104   0.794 0.427024    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Correlation of Fixed Effects:
##             (Intr) Fl___T mnt__9 tp2_rt St_ty_ dm_wn_ dm_tmp Pl_C_T
## Flrl_smp__T -0.127                                                 
## mnts_snc_9m  0.216  0.091                                          
## top2_ratio  -0.104  0.194 -0.144                                   
## St_typyng_r -0.794  0.190 -0.271  0.101                            
## dm_wnd_vlct  0.396  0.002  0.175  0.030 -0.479                     
## dm_tempertr  0.547 -0.029  0.246  0.091 -0.681  0.682              
## Plot_Covr_T -0.331  0.640  0.050 -0.043  0.429 -0.307 -0.445       
## Dys_snc_str  0.080  0.142  0.239 -0.015 -0.096 -0.359 -0.054  0.115
## optimizer (Nelder_Mead) convergence code: 0 (OK)
## boundary (singular) fit: see help('isSingular')
parameters(netting_mod1_NB)
## # Fixed Effects
## 
## Parameter                  | Log-Mean |   SE |         95% CI |     z |      p
## ------------------------------------------------------------------------------
## (Intercept)                |     4.10 | 0.08 | [ 3.94,  4.25] | 51.21 | < .001
## Floral simpson index T     |    -0.31 | 0.08 | [-0.46, -0.16] | -4.00 | < .001
## minutes since 9am          |     0.06 | 0.06 | [-0.05,  0.17] |  1.09 | 0.277 
## top2 ratio                 |     0.24 | 0.05 | [ 0.14,  0.34] |  4.73 | < .001
## Site type [young_restored] |    -0.48 | 0.15 | [-0.77, -0.19] | -3.29 | < .001
## dm wind velocity           |    -0.13 | 0.08 | [-0.28,  0.02] | -1.67 | 0.096 
## dm temperature             |     0.09 | 0.09 | [-0.08,  0.26] |  1.02 | 0.309 
## Plot Cover T               |    -0.22 | 0.08 | [-0.38, -0.06] | -2.64 | 0.008 
## Days since start           |     0.05 | 0.06 | [-0.07,  0.17] |  0.79 | 0.427 
## 
## # Random Effects
## 
## Parameter            | Coefficient
## ----------------------------------
## SD (Intercept: site) |    5.72e-06
## 
## Uncertainty intervals (equal-tailed) and p-values (two-tailed) computed
##   using a Wald z-distribution approximation.
#check the model
check_model(netting_mod1_NB, verbose = T)
## Some of the variables were in matrix-format - probably you used
##   `scale()` on your data?
##   If so, and you get an error, please try `datawizard::standardize()` to
##   standardize your data.
## Some of the variables were in matrix-format - probably you used
##   `scale()` on your data?
##   If so, and you get an error, please try `datawizard::standardize()` to
##   standardize your data.

#overdispersion
check_overdispersion(netting_mod1_NB)
## # Overdispersion test
## 
##  dispersion ratio = 1.559
##           p-value = 0.128
## No overdispersion detected.
#collinearity
check_collinearity(netting_mod1_NB)
## # Check for Multicollinearity
## 
## Low Correlation
## 
##                    Term  VIF   VIF 95% CI Increased SE Tolerance
##  Floral_simpson_index_T 2.22 [1.66, 3.27]         1.49      0.45
##       minutes_since_9am 1.25 [1.06, 1.93]         1.12      0.80
##              top2_ratio 1.16 [1.03, 1.97]         1.08      0.86
##               Site_type 2.22 [1.66, 3.27]         1.49      0.45
##        dm_wind_velocity 2.45 [1.80, 3.62]         1.57      0.41
##          dm_temperature 3.18 [2.27, 4.74]         1.78      0.31
##            Plot_Cover_T 2.63 [1.92, 3.90]         1.62      0.38
##        Days_since_start 1.48 [1.19, 2.18]         1.22      0.68
##  Tolerance 95% CI
##      [0.31, 0.60]
##      [0.52, 0.94]
##      [0.51, 0.97]
##      [0.31, 0.60]
##      [0.28, 0.55]
##      [0.21, 0.44]
##      [0.26, 0.52]
##      [0.46, 0.84]
# Compute fitted values and Pearson residuals
netting_mod1_NB_vals <- fitted(netting_mod1_NB)
netting_mod1_NB_residuals <- residuals(netting_mod1_NB, type = "pearson")

# Create binned residuals plot
arm::binnedplot(netting_mod1_NB_vals, netting_mod1_NB_residuals)

#DHARMa package - simulate residuals and check model assumptions
netting_mod1_NB_sim_res <- simulateResiduals(fittedModel = netting_mod1_NB)
plot(netting_mod1_NB_sim_res)

With this model, the assumptions are met: the QQ plot shows no systematic deviation, no overdispersion is detected, and the residuals are homoscedastic. There are still too many predictors for the amount of data we have, so we remove predictors one by one and check the model assumptions after each step.
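
To see why the negative binomial accommodates the extra variation, the sketch below plugs values printed by `summary(netting_mod1_NB)` into the NB2 variance function, Var(Y) = mu + mu^2 / theta, and converts the site-type coefficient into a rate ratio. The numbers are copied from the output above; the object names are illustrative, and the means refer to a transect with all scaled predictors at 0.

```r
# Values copied from summary(netting_mod1_NB)
b_intercept <- 4.09649    # log mean count, reference site, scaled predictors at 0
b_restored  <- -0.48107   # Site_type = young_restored
theta       <- 11.8662    # negative binomial dispersion parameter

mu_reference <- exp(b_intercept)               # expected interactions per transect
mu_restored  <- exp(b_intercept + b_restored)

# Negative binomial (NB2) variance: Var(Y) = mu + mu^2 / theta
var_nb      <- mu_reference + mu_reference^2 / theta
var_poisson <- mu_reference                    # a Poisson model would assume Var(Y) = mu

rate_ratio <- exp(b_restored)                  # restored relative to reference sites
print(round(c(mu_reference = mu_reference, mu_restored = mu_restored,
              var_nb = var_nb, rate_ratio = rate_ratio), 2))
```

The variance implied by the negative binomial is several times the mean at this count level, which a Poisson model cannot represent.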

# reduced model: interaction counts as response variable; environmental, weather and plant diversity variables as explanatory variables; site as random effect
# remove minutes_since_9am (p = 0.277 in netting_mod1_NB)

netting_mod2_NB <- glmer.nb(total_interaction_T 
                           ~Floral_simpson_index_T 
                           #+ minutes_since_9am
                           + top2_ratio
                           + Site_type
                           + dm_wind_velocity
                           + dm_temperature
                           + Plot_Cover_T
                           #+  majority_class
                           #+ urban + agri + grass + snh + forest + water
                           + Days_since_start
                           + (1|site), 
                           data = netting1)  # glmer.nb() fits the negative binomial itself and takes no family argument
## boundary (singular) fit: see help('isSingular')
summary(netting_mod2_NB)
## Generalized linear mixed model fit by maximum likelihood (Laplace
##   Approximation) [glmerMod]
##  Family: Negative Binomial(11.6287)  ( log )
## Formula: 
## total_interaction_T ~ Floral_simpson_index_T + top2_ratio + Site_type +  
##     dm_wind_velocity + dm_temperature + Plot_Cover_T + Days_since_start +  
##     (1 | site)
##    Data: netting1
## 
##      AIC      BIC   logLik deviance df.resid 
##    394.1    412.2   -187.1    374.1       35 
## 
## Scaled residuals: 
##     Min      1Q  Median      3Q     Max 
## -1.6440 -0.6962  0.1069  0.5941  2.3136 
## 
## Random effects:
##  Groups Name        Variance  Std.Dev. 
##  site   (Intercept) 2.801e-12 1.674e-06
## Number of obs: 45, groups:  site, 9
## 
## Fixed effects:
##                         Estimate Std. Error z value Pr(>|z|)    
## (Intercept)              4.07832    0.07850  51.950  < 2e-16 ***
## Floral_simpson_index_T  -0.31441    0.07626  -4.123 3.74e-05 ***
## top2_ratio               0.24888    0.05086   4.894 9.89e-07 ***
## Site_typeyoung_restored -0.43745    0.14124  -3.097  0.00195 ** 
## dm_wind_velocity        -0.14513    0.07735  -1.876  0.06062 .  
## dm_temperature           0.06644    0.08646   0.768  0.44225    
## Plot_Cover_T            -0.22321    0.08248  -2.706  0.00681 ** 
## Days_since_start         0.03237    0.05961   0.543  0.58704    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Correlation of Fixed Effects:
##             (Intr) Fl___T tp2_rt St_ty_ dm_wn_ dm_tmp Pl_C_T
## Flrl_smp__T -0.145                                          
## top2_ratio  -0.075  0.201                                   
## St_typyng_r -0.781  0.218  0.065                            
## dm_wnd_vlct  0.372 -0.013  0.054 -0.455                     
## dm_tempertr  0.520 -0.050  0.128 -0.657  0.667              
## Plot_Covr_T -0.348  0.636 -0.042  0.458 -0.319 -0.470       
## Dys_snc_str  0.026  0.125  0.023 -0.029 -0.423 -0.122  0.108
## optimizer (Nelder_Mead) convergence code: 0 (OK)
## boundary (singular) fit: see help('isSingular')
parameters(netting_mod2_NB)
## # Fixed Effects
## 
## Parameter                  | Log-Mean |   SE |         95% CI |     z |      p
## ------------------------------------------------------------------------------
## (Intercept)                |     4.08 | 0.08 | [ 3.92,  4.23] | 51.95 | < .001
## Floral simpson index T     |    -0.31 | 0.08 | [-0.46, -0.16] | -4.12 | < .001
## top2 ratio                 |     0.25 | 0.05 | [ 0.15,  0.35] |  4.89 | < .001
## Site type [young_restored] |    -0.44 | 0.14 | [-0.71, -0.16] | -3.10 | 0.002 
## dm wind velocity           |    -0.15 | 0.08 | [-0.30,  0.01] | -1.88 | 0.061 
## dm temperature             |     0.07 | 0.09 | [-0.10,  0.24] |  0.77 | 0.442 
## Plot Cover T               |    -0.22 | 0.08 | [-0.38, -0.06] | -2.71 | 0.007 
## Days since start           |     0.03 | 0.06 | [-0.08,  0.15] |  0.54 | 0.587 
## 
## # Random Effects
## 
## Parameter            | Coefficient
## ----------------------------------
## SD (Intercept: site) |    1.67e-06
## 
## Uncertainty intervals (equal-tailed) and p-values (two-tailed) computed
##   using a Wald z-distribution approximation.
#check the model
check_model(netting_mod2_NB, verbose = T)
## Some of the variables were in matrix-format - probably you used
##   `scale()` on your data?
##   If so, and you get an error, please try `datawizard::standardize()` to
##   standardize your data.
## Some of the variables were in matrix-format - probably you used
##   `scale()` on your data?
##   If so, and you get an error, please try `datawizard::standardize()` to
##   standardize your data.

#overdispersion
check_overdispersion(netting_mod2_NB)
## # Overdispersion test
## 
##  dispersion ratio = 1.471
##           p-value = 0.136
## No overdispersion detected.
#collinearity
check_collinearity(netting_mod2_NB)
## # Check for Multicollinearity
## 
## Low Correlation
## 
##                    Term  VIF   VIF 95% CI Increased SE Tolerance
##  Floral_simpson_index_T 2.18 [1.62, 3.24]         1.48      0.46
##              top2_ratio 1.13 [1.02, 2.14]         1.07      0.88
##               Site_type 2.04 [1.53, 3.03]         1.43      0.49
##        dm_wind_velocity 2.37 [1.74, 3.53]         1.54      0.42
##          dm_temperature 2.95 [2.10, 4.44]         1.72      0.34
##            Plot_Cover_T 2.60 [1.88, 3.90]         1.61      0.38
##        Days_since_start 1.39 [1.14, 2.10]         1.18      0.72
##  Tolerance 95% CI
##      [0.31, 0.62]
##      [0.47, 0.98]
##      [0.33, 0.65]
##      [0.28, 0.58]
##      [0.23, 0.48]
##      [0.26, 0.53]
##      [0.48, 0.88]
# Compute fitted values and Pearson residuals
netting_mod2_NB_vals <- fitted(netting_mod2_NB)
netting_mod2_NB_residuals <- residuals(netting_mod2_NB, type = "pearson")

# Create binned residuals plot
arm::binnedplot(netting_mod2_NB_vals, netting_mod2_NB_residuals)

#DHARMa package - simulate residuals and check model assumptions
netting_mod2_NB_sim_res <- simulateResiduals(fittedModel = netting_mod2_NB)
plot(netting_mod2_NB_sim_res)
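
The dispersion ratio reported by check_overdispersion() above is commonly defined as the sum of squared Pearson residuals divided by the residual degrees of freedom. The sketch below illustrates that definition on simulated counts with a plain Poisson GLM. The data, the helper name dispersion_ratio, and the assumption that the performance package uses exactly this formula for glmer models are illustrative, not taken from the thesis analysis.

```r
# Sketch: Pearson dispersion ratio on simulated counts (not the thesis data).
set.seed(1)
n  <- 200
x  <- rnorm(n)
mu <- exp(3 + 0.3 * x)

y_pois <- rpois(n, lambda = mu)            # variance equals the mean
y_over <- rnbinom(n, mu = mu, size = 1)    # variance = mu + mu^2 / size

# Hypothetical helper: squared Pearson residuals over residual df
dispersion_ratio <- function(model) {
  sum(residuals(model, type = "pearson")^2) / df.residual(model)
}

fit_pois <- glm(y_pois ~ x, family = poisson)
fit_over <- glm(y_over ~ x, family = poisson)

ratio_pois <- dispersion_ratio(fit_pois)   # expected to be close to 1
ratio_over <- dispersion_ratio(fit_over)   # expected to be well above 1
print(c(poisson = ratio_pois, overdispersed = ratio_over))
```

A ratio near 1 is what an adequately specified count model should give. A Poisson fit to negative-binomial counts gives a much larger ratio, which is the situation the negative binomial family is meant to absorb.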

# full model with interaction counts as response variable and environmental, weather and plant diversity variables as explanatory variables, and site as random effect
# remove Days_since_start (p = 0.587 in netting_mod2_NB)

netting_mod3_NB <- glmer.nb(total_interaction_T 
                           ~Floral_simpson_index_T 
                           #+ minutes_since_9am
                           + top2_ratio
                           + Site_type
                           + dm_wind_velocity
                           + dm_temperature
                           + Plot_Cover_T
                           #+  majority_class
                           #+ urban + agri + grass + snh + forest + water
                           #+ Days_since_start
                           + (1|site), 
                           data = netting1,
                           family = nbinom2)
## boundary (singular) fit: see help('isSingular')
summary(netting_mod3_NB)
## Generalized linear mixed model fit by maximum likelihood (Laplace
##   Approximation) [glmerMod]
##  Family: Negative Binomial(11.47)  ( log )
## Formula: 
## total_interaction_T ~ Floral_simpson_index_T + top2_ratio + Site_type +  
##     dm_wind_velocity + dm_temperature + Plot_Cover_T + (1 | site)
##    Data: netting1
## 
##      AIC      BIC   logLik deviance df.resid 
##    392.4    408.7   -187.2    374.4       36 
## 
## Scaled residuals: 
##      Min       1Q   Median       3Q      Max 
## -1.66809 -0.62795  0.02883  0.56021  2.48985 
## 
## Random effects:
##  Groups Name        Variance  Std.Dev. 
##  site   (Intercept) 3.392e-11 5.824e-06
## Number of obs: 45, groups:  site, 9
## 
## Fixed effects:
##                         Estimate Std. Error z value Pr(>|z|)    
## (Intercept)              4.07733    0.07906  51.570  < 2e-16 ***
## Floral_simpson_index_T  -0.31961    0.07618  -4.196 2.72e-05 ***
## top2_ratio               0.24825    0.05114   4.854 1.21e-06 ***
## Site_typeyoung_restored -0.43497    0.14232  -3.056  0.00224 ** 
## dm_wind_velocity        -0.12734    0.07049  -1.806  0.07084 .  
## dm_temperature           0.07216    0.08659   0.833  0.40466    
## Plot_Cover_T            -0.22800    0.08263  -2.759  0.00579 ** 
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Correlation of Fixed Effects:
##             (Intr) Fl___T tp2_rt St_ty_ dm_wn_ dm_tmp
## Flrl_smp__T -0.152                                   
## top2_ratio  -0.078  0.206                            
## St_typyng_r -0.782  0.225  0.069                     
## dm_wnd_vlct  0.426  0.043  0.077 -0.518              
## dm_tempertr  0.528 -0.037  0.134 -0.665  0.684       
## Plot_Covr_T -0.353  0.632 -0.042  0.463 -0.306 -0.464
## optimizer (Nelder_Mead) convergence code: 0 (OK)
## boundary (singular) fit: see help('isSingular')
parameters(netting_mod3_NB)
## # Fixed Effects
## 
## Parameter                  | Log-Mean |   SE |         95% CI |     z |      p
## ------------------------------------------------------------------------------
## (Intercept)                |     4.08 | 0.08 | [ 3.92,  4.23] | 51.57 | < .001
## Floral simpson index T     |    -0.32 | 0.08 | [-0.47, -0.17] | -4.20 | < .001
## top2 ratio                 |     0.25 | 0.05 | [ 0.15,  0.35] |  4.85 | < .001
## Site type [young_restored] |    -0.43 | 0.14 | [-0.71, -0.16] | -3.06 | 0.002 
## dm wind velocity           |    -0.13 | 0.07 | [-0.27,  0.01] | -1.81 | 0.071 
## dm temperature             |     0.07 | 0.09 | [-0.10,  0.24] |  0.83 | 0.405 
## Plot Cover T               |    -0.23 | 0.08 | [-0.39, -0.07] | -2.76 | 0.006 
## 
## # Random Effects
## 
## Parameter            | Coefficient
## ----------------------------------
## SD (Intercept: site) |    5.82e-06
## 
## Uncertainty intervals (equal-tailed) and p-values (two-tailed) computed
##   using a Wald z-distribution approximation.
#check the model
check_model(netting_mod3_NB, verbose = T)
## Some of the variables were in matrix-format - probably you used
##   `scale()` on your data?
##   If so, and you get an error, please try `datawizard::standardize()` to
##   standardize your data.
## Some of the variables were in matrix-format - probably you used
##   `scale()` on your data?
##   If so, and you get an error, please try `datawizard::standardize()` to
##   standardize your data.

#overdispersion
check_overdispersion(netting_mod3_NB)
## # Overdispersion test
## 
##  dispersion ratio = 1.520
##           p-value = 0.136
## No overdispersion detected.
#collinearity
check_collinearity(netting_mod3_NB)
## # Check for Multicollinearity
## 
## Low Correlation
## 
##                    Term  VIF   VIF 95% CI Increased SE Tolerance
##  Floral_simpson_index_T 2.15 [1.59, 3.24]         1.47      0.46
##              top2_ratio 1.14 [1.02, 2.17]         1.07      0.88
##               Site_type 2.05 [1.53, 3.08]         1.43      0.49
##        dm_wind_velocity 1.94 [1.46, 2.92]         1.39      0.51
##          dm_temperature 2.90 [2.05, 4.41]         1.70      0.35
##            Plot_Cover_T 2.58 [1.85, 3.91]         1.60      0.39
##  Tolerance 95% CI
##      [0.31, 0.63]
##      [0.46, 0.98]
##      [0.32, 0.65]
##      [0.34, 0.68]
##      [0.23, 0.49]
##      [0.26, 0.54]
# Compute fitted values and Pearson residuals
netting_mod3_NB_vals <- fitted(netting_mod3_NB)
netting_mod3_NB_residuals <- residuals(netting_mod3_NB, type = "pearson")

# Create binned residuals plot
arm::binnedplot(netting_mod3_NB_vals, netting_mod3_NB_residuals)

#DHARMa package - simulate residuals and check model assumptions
netting_mod3_NB_sim_res <- simulateResiduals(fittedModel = netting_mod3_NB)
plot(netting_mod3_NB_sim_res)

#  model with pollinator interaction counts as response variable and environmental, weather and plant diversity variables as explanatory variables, and site as random effect
# remove dm_temperature (p = 0.405 in netting_mod3_NB)
netting_mod4_NB <- glmer.nb(total_interaction_T 
                           ~Floral_simpson_index_T 
                           #+ minutes_since_9am
                           + top2_ratio
                           + Site_type
                           + dm_wind_velocity
                           + Plot_Cover_T
                           #+ dm_temperature
                           #+  majority_class
                           #+ urban + agri + grass + snh + forest + water
                           #+ Days_since_start
                           + (1|site), 
                           data = netting1, 
                           family = nbinom2)
## boundary (singular) fit: see help('isSingular')
summary(netting_mod4_NB)
## Generalized linear mixed model fit by maximum likelihood (Laplace
##   Approximation) [glmerMod]
##  Family: Negative Binomial(11.2679)  ( log )
## Formula: 
## total_interaction_T ~ Floral_simpson_index_T + top2_ratio + Site_type +  
##     dm_wind_velocity + Plot_Cover_T + (1 | site)
##    Data: netting1
## 
##      AIC      BIC   logLik deviance df.resid 
##    391.1    405.6   -187.6    375.1       37 
## 
## Scaled residuals: 
##      Min       1Q   Median       3Q      Max 
## -1.72295 -0.69167 -0.05456  0.72668  2.37554 
## 
## Random effects:
##  Groups Name        Variance Std.Dev. 
##  site   (Intercept) 7.16e-09 8.462e-05
## Number of obs: 45, groups:  site, 9
## 
## Fixed effects:
##                         Estimate Std. Error z value Pr(>|z|)    
## (Intercept)              4.04329    0.06759  59.817  < 2e-16 ***
## Floral_simpson_index_T  -0.31748    0.07669  -4.140 3.48e-05 ***
## top2_ratio               0.24235    0.05107   4.746 2.08e-06 ***
## Site_typeyoung_restored -0.35648    0.10692  -3.334 0.000855 ***
## dm_wind_velocity        -0.16734    0.05223  -3.204 0.001355 ** 
## Plot_Cover_T            -0.19590    0.07389  -2.651 0.008022 ** 
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Correlation of Fixed Effects:
##             (Intr) Fl___T tp2_rt St_ty_ dm_wn_
## Flrl_smp__T -0.154                            
## top2_ratio  -0.172  0.210                     
## St_typyng_r -0.678  0.266  0.208              
## dm_wnd_vlct  0.111  0.093 -0.017 -0.125       
## Plot_Covr_T -0.142  0.693  0.015  0.230  0.013
## optimizer (Nelder_Mead) convergence code: 0 (OK)
## boundary (singular) fit: see help('isSingular')
parameters(netting_mod4_NB)
## # Fixed Effects
## 
## Parameter                  | Log-Mean |   SE |         95% CI |     z |      p
## ------------------------------------------------------------------------------
## (Intercept)                |     4.04 | 0.07 | [ 3.91,  4.18] | 59.82 | < .001
## Floral simpson index T     |    -0.32 | 0.08 | [-0.47, -0.17] | -4.14 | < .001
## top2 ratio                 |     0.24 | 0.05 | [ 0.14,  0.34] |  4.75 | < .001
## Site type [young_restored] |    -0.36 | 0.11 | [-0.57, -0.15] | -3.33 | < .001
## dm wind velocity           |    -0.17 | 0.05 | [-0.27, -0.06] | -3.20 | 0.001 
## Plot Cover T               |    -0.20 | 0.07 | [-0.34, -0.05] | -2.65 | 0.008 
## 
## # Random Effects
## 
## Parameter            | Coefficient
## ----------------------------------
## SD (Intercept: site) |    8.46e-05
## 
## Uncertainty intervals (equal-tailed) and p-values (two-tailed) computed
##   using a Wald z-distribution approximation.
#check the model
check_model(netting_mod4_NB, verbose = T)
## Some of the variables were in matrix-format - probably you used
##   `scale()` on your data?
##   If so, and you get an error, please try `datawizard::standardize()` to
##   standardize your data.
## Some of the variables were in matrix-format - probably you used
##   `scale()` on your data?
##   If so, and you get an error, please try `datawizard::standardize()` to
##   standardize your data.

#overdispersion
check_overdispersion(netting_mod4_NB)
## # Overdispersion test
## 
##  dispersion ratio = 1.510
##           p-value = 0.184
## No overdispersion detected.
#collinearity
check_collinearity(netting_mod4_NB)
## # Check for Multicollinearity
## 
## Low Correlation
## 
##                    Term  VIF    VIF 95% CI Increased SE Tolerance
##  Floral_simpson_index_T 2.15 [1.58,  3.28]         1.47      0.47
##              top2_ratio 1.12 [1.01,  2.43]         1.06      0.89
##               Site_type 1.14 [1.02,  2.25]         1.07      0.88
##        dm_wind_velocity 1.04 [1.00, 30.65]         1.02      0.96
##            Plot_Cover_T 2.02 [1.50,  3.08]         1.42      0.49
##  Tolerance 95% CI
##      [0.31, 0.63]
##      [0.41, 0.99]
##      [0.45, 0.98]
##      [0.03, 1.00]
##      [0.32, 0.67]
# Compute fitted values and Pearson residuals
netting_mod4_NB_vals <- fitted(netting_mod4_NB)
netting_mod4_NB_residuals <- residuals(netting_mod4_NB, type = "pearson")
# Create binned residuals plot
arm::binnedplot(netting_mod4_NB_vals, netting_mod4_NB_residuals)

#check singularity
isSingular(netting_mod4_NB)
## [1] TRUE
#DHARMa package - simulate residuals and check model assumptions
netting_mod4_NB_sim_res <- simulateResiduals(fittedModel = netting_mod4_NB)
plot(netting_mod4_NB_sim_res)
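
The "Increased SE" and "Tolerance" columns in the collinearity table for netting_mod4_NB follow directly from the VIF: the standard error inflation is sqrt(VIF) and the tolerance is 1 / VIF. The short check below recomputes both from the rounded VIF values printed above. Small differences in the last digit are expected because the printed VIFs are already rounded.

```r
# VIF values copied (rounded) from the check_collinearity() output of netting_mod4_NB
vif <- c(Floral_simpson_index_T = 2.15,
         top2_ratio             = 1.12,
         Site_type              = 1.14,
         dm_wind_velocity       = 1.04,
         Plot_Cover_T           = 2.02)

increased_se <- sqrt(vif)   # factor by which the standard error is inflated
tolerance    <- 1 / vif     # share of predictor variance not explained by the others

print(round(cbind(VIF = vif, increased_se, tolerance), 2))
```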

# drop Floral_simpson_index_T to test a further reduced model (note: the term is significant in netting_mod4_NB, p < 0.001)
netting_mod5_NB <- glmer.nb(total_interaction_T 
                           #~Floral_simpson_index_T 
                           #+ minutes_since_9am
                           ~ top2_ratio
                           + Site_type
                           + dm_wind_velocity
                           + Plot_Cover_T
                           #+ dm_temperature
                           #+  majority_class
                           #+ urban + agri + grass + snh + forest + water
                           #+ Days_since_start
                           + (1|site), 
                           data = netting1, 
                           family = nbinom2)
## boundary (singular) fit: see help('isSingular')
summary(netting_mod5_NB)
## Generalized linear mixed model fit by maximum likelihood (Laplace
##   Approximation) [glmerMod]
##  Family: Negative Binomial(7.7409)  ( log )
## Formula: total_interaction_T ~ top2_ratio + Site_type + dm_wind_velocity +  
##     Plot_Cover_T + (1 | site)
##    Data: netting1
## 
##      AIC      BIC   logLik deviance df.resid 
##    403.4    416.0   -194.7    389.4       38 
## 
## Scaled residuals: 
##      Min       1Q   Median       3Q      Max 
## -1.62019 -0.74681 -0.07081  0.47913  2.57753 
## 
## Random effects:
##  Groups Name        Variance  Std.Dev. 
##  site   (Intercept) 4.771e-11 6.907e-06
## Number of obs: 45, groups:  site, 9
## 
## Fixed effects:
##                         Estimate Std. Error z value Pr(>|z|)    
## (Intercept)              4.01109    0.07822  51.278  < 2e-16 ***
## top2_ratio               0.28488    0.05901   4.827 1.38e-06 ***
## Site_typeyoung_restored -0.23644    0.12071  -1.959   0.0501 .  
## dm_wind_velocity        -0.14979    0.06055  -2.474   0.0134 *  
## Plot_Cover_T             0.01912    0.06490   0.295   0.7683    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Correlation of Fixed Effects:
##             (Intr) tp2_rt St_ty_ dm_wn_
## top2_ratio  -0.157                     
## St_typyng_r -0.673  0.194              
## dm_wnd_vlct  0.124 -0.023 -0.158       
## Plot_Covr_T -0.036 -0.127  0.052 -0.085
## optimizer (Nelder_Mead) convergence code: 0 (OK)
## boundary (singular) fit: see help('isSingular')
parameters(netting_mod5_NB)
## # Fixed Effects
## 
## Parameter                  | Log-Mean |   SE |         95% CI |     z |      p
## ------------------------------------------------------------------------------
## (Intercept)                |     4.01 | 0.08 | [ 3.86,  4.16] | 51.28 | < .001
## top2 ratio                 |     0.28 | 0.06 | [ 0.17,  0.40] |  4.83 | < .001
## Site type [young_restored] |    -0.24 | 0.12 | [-0.47,  0.00] | -1.96 | 0.050 
## dm wind velocity           |    -0.15 | 0.06 | [-0.27, -0.03] | -2.47 | 0.013 
## Plot Cover T               |     0.02 | 0.06 | [-0.11,  0.15] |  0.29 | 0.768 
## 
## # Random Effects
## 
## Parameter            | Coefficient
## ----------------------------------
## SD (Intercept: site) |    6.91e-06
## 
## Uncertainty intervals (equal-tailed) and p-values (two-tailed) computed
##   using a Wald z-distribution approximation.
#check the model
check_model(netting_mod5_NB, verbose = T)
## Some of the variables were in matrix-format - probably you used
##   `scale()` on your data?
##   If so, and you get an error, please try `datawizard::standardize()` to
##   standardize your data.
## Some of the variables were in matrix-format - probably you used
##   `scale()` on your data?
##   If so, and you get an error, please try `datawizard::standardize()` to
##   standardize your data.

#overdispersion
check_overdispersion(netting_mod5_NB)
## # Overdispersion test
## 
##  dispersion ratio = 1.677
##           p-value = 0.112
## No overdispersion detected.
#collinearity
check_collinearity(netting_mod5_NB)
## # Check for Multicollinearity
## 
## Low Correlation
## 
##              Term  VIF     VIF 95% CI Increased SE Tolerance Tolerance 95% CI
##        top2_ratio 1.06 [1.00,   6.83]         1.03      0.94     [0.15, 1.00]
##         Site_type 1.07 [1.00,   4.79]         1.03      0.93     [0.21, 1.00]
##  dm_wind_velocity 1.03 [1.00, 112.07]         1.02      0.97     [0.01, 1.00]
##      Plot_Cover_T 1.03 [1.00, 219.73]         1.01      0.97     [0.00, 1.00]
# Compute fitted values and Pearson residuals
netting_mod5_NB_vals <- fitted(netting_mod5_NB)
netting_mod5_NB_residuals <- residuals(netting_mod5_NB, type = "pearson")
# Create binned residuals plot
arm::binnedplot(netting_mod5_NB_vals, netting_mod5_NB_residuals)

#DHARMa package - simulate residuals and check model assumptions
netting_mod5_NB_sim_res <- simulateResiduals(fittedModel = netting_mod5_NB)
plot(netting_mod5_NB_sim_res)

IV.B.2.a. Compare the NB models with the performance package

# Compare the models with the performance package
netting_NB_comp1 <- compare_performance(netting_mod1_NB, netting_mod2_NB, netting_mod3_NB, netting_mod4_NB, netting_mod5_NB,  
                                        metrics = c("AICc", "BIC", "R2", "ICC", "RMSE"))

# Print the comparison table
print(netting_NB_comp1)
## # Comparison of Model Performance Indices
## 
## Name            |    Model | AICc (weights) | BIC (weights) |   RMSE
## --------------------------------------------------------------------
## netting_mod1_NB | glmerMod |  403.0 (0.014) | 414.8 (0.008) | 22.693
## netting_mod2_NB | glmerMod |  400.6 (0.047) | 412.2 (0.029) | 21.991
## netting_mod3_NB | glmerMod |  397.6 (0.213) | 408.7 (0.167) | 22.701
## netting_mod4_NB | glmerMod |  395.1 (0.724) | 405.6 (0.792) | 22.459
## netting_mod5_NB | glmerMod |  406.4 (0.003) | 416.0 (0.004) | 27.932
## 
## Name            | R2 (cond.) | R2 (marg.) |       ICC
## -----------------------------------------------------
## netting_mod1_NB |            |            |          
## netting_mod2_NB |            |            |          
## netting_mod3_NB |            |            |          
## netting_mod4_NB |      0.615 |      0.615 | 6.999e-08
## netting_mod5_NB |            |            |

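The comparison shows that the fourth model (netting_mod4_NB) has the lowest AICc (395.1) and BIC (405.6) and carries the largest AICc and BIC weights (0.724 and 0.792). Dropping Floral_simpson_index_T (netting_mod5_NB) makes the fit clearly worse (AICc = 406.4), so netting_mod4_NB is kept.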
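
The AICc weights in the comparison table can be reproduced by hand from the AICc values: each model's weight is exp(-delta / 2), with delta the difference to the lowest AICc, normalised to sum to 1. The sketch below uses the rounded AICc values printed in the table, so the third decimal can differ slightly from the compare_performance() output.

```r
# AICc values copied (rounded) from the comparison table
aicc <- c(netting_mod1_NB = 403.0,
          netting_mod2_NB = 400.6,
          netting_mod3_NB = 397.6,
          netting_mod4_NB = 395.1,
          netting_mod5_NB = 406.4)

delta   <- aicc - min(aicc)          # distance to the best model
rel_lik <- exp(-delta / 2)           # relative likelihood of each model
weights <- rel_lik / sum(rel_lik)    # Akaike weights, sum to 1

print(round(cbind(AICc = aicc, delta, weight = weights), 3))
best_model <- names(which.max(weights))
```

A weight of about 0.72 for netting_mod4_NB means that, within this candidate set, it is by far the best-supported model. The weights say nothing about models outside the set.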

IV.B.2.b. Visualize the model results

netting_mod4_NB

Parameter                  | Log-Mean |   SE |         95% CI |     z |      p
------------------------------------------------------------------------------
(Intercept)                |     4.04 | 0.07 | [ 3.91,  4.18] | 59.82 | < .001
Floral simpson index T     |    -0.32 | 0.08 | [-0.47, -0.17] | -4.14 | < .001
top2 ratio                 |     0.24 | 0.05 | [ 0.14,  0.34] |  4.75 | < .001
Site type [young_restored] |    -0.36 | 0.11 | [-0.57, -0.15] | -3.33 | < .001
dm wind velocity           |    -0.17 | 0.05 | [-0.27, -0.06] | -3.20 | 0.001
Plot Cover T               |    -0.20 | 0.07 | [-0.34, -0.05] | -2.65 | 0.008

library(sjPlot)
#plot_model(netting_mod1_NB, type = "est", show.values = TRUE, value.offset = 0.3)
#plot_model(netting_mod2_NB, type = "est", show.values = TRUE, value.offset = 0.3)
#plot_model(netting_mod3_NB, type = "est", show.values = TRUE, value.offset = 0.3)

plot_model(netting_mod4_NB, type = "est", show.values = TRUE, value.offset = 0.3) ## KEEPER

(est_count_net <- plot_model(netting_mod4_NB, 
           type = "est", 
           show.values = TRUE, 
           value.offset = 0.3,
           #sort.est = TRUE,
           axis.labels = c("Flower Cover % per Transect",
                           "Wind Velocity (km/h)",
                           "Young Restored Site",
                           "Top 2 Flower Ratio",
                           "Floral Simpson Index")) +
    labs(title = "", x = "Predictors",y = "Estimate") + 
    theme(axis.text.y = element_text(hjust = 0)))  # 0 = left, 1 = right

#plot_model(netting_mod5_NB, type = "est", show.values = TRUE, value.offset = 0.3)
## Site Type netting_mod4_NB ------------  
pred <- ggpredict(netting_mod4_NB, terms = "Site_type")

# Manually assign group labels in the correct order
pred$group <- factor(c("Reference", "Young Restored"), levels = c("Reference", "Young Restored"))

# Plot using your custom colors
cc <- c("Reference" = "#1F78B4", "Young Restored" = "#9BB655FF")

(sitetype_count_net <- plot(pred) +
  labs(
    title = "Predicted Insect Counts vs Site Type",
    x = "Site Type",
    y = "Predicted Count of insects caught \nduring transect walk"
  ) +
  scale_color_manual(values = cc) +
  scale_fill_manual(values = cc) +
  theme_new +
  theme(legend.position = "none") +  # Remove the legend
  geom_point(size = 5 ) ) #Increase the point size
## Scale for colour is already present.
## Adding another scale for colour, which will replace the existing scale.

  #coord_fixed(ratio = 0.04))  

## wind netting_mod4_NB ---------
# Get the original mean and SD of wind velocity before scaling
wind_mean <- mean(envir_data$dm_wind_velocity, na.rm = TRUE)
wind_sd <- sd(envir_data$dm_wind_velocity, na.rm = TRUE)

# Get predictions on the scaled variable
pred_wind <- ggpredict(netting_mod4_NB , terms = "dm_wind_velocity")

# Unscale the x-axis
pred_wind$x_unscaled <- (pred_wind$x * wind_sd) + wind_mean

# Plot
(wind_count_net <- ggplot(pred_wind, aes(x = x_unscaled, y = predicted)) +
  geom_line(size = 1.2, color = predictor_colors[["dm_wind_velocity"]]) +
  geom_ribbon(aes(ymin = conf.low, ymax = conf.high), 
              fill = alpha(predictor_colors[["dm_wind_velocity"]], 0.5)) +
  labs(
    title = "Transect walk: Predicted Interaction Counts vs Wind Velocity",
    x = "Wind Velocity (km/hour)",
    y = "Predicted Insect Count"
  ))
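
The back-transformation used for the x-axis (x * sd + mean) only recovers the original units if the mean and SD come from exactly the same values that were passed to scale(). The sketch below shows this on made-up wind speeds. The numbers and the object names wind_raw, wind_scaled and wind_back are hypothetical, and whether scaled_envir_data was built from envir_data with the same rows is an assumption worth checking in the real workflow.

```r
# Sketch with made-up wind speeds (km/h), not the thesis data
set.seed(42)
wind_raw    <- runif(45, min = 2, max = 25)
wind_scaled <- scale(wind_raw)   # one-column matrix with centering/scaling attributes

# scale() stores the mean and SD it used as attributes
wind_mean <- attr(wind_scaled, "scaled:center")
wind_sd   <- attr(wind_scaled, "scaled:scale")

# Back-transform: recovers the original values
wind_back <- as.numeric(wind_scaled) * wind_sd + wind_mean
recovered <- isTRUE(all.equal(wind_back, wind_raw))

# Using the mean/SD of a different subset gives a shifted, stretched axis
other_mean <- mean(wind_raw[1:20])
other_sd   <- sd(wind_raw[1:20])
wind_wrong <- as.numeric(wind_scaled) * other_sd + other_mean
max_error  <- max(abs(wind_wrong - wind_raw))

print(c(recovered = recovered, max_error_with_wrong_subset = max_error))
```

Reading the centre and scale from the attributes of the scaled column, where they are available, avoids recomputing them from a data frame that may have been filtered differently.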

## top2_ratio netting_mod4_NB ---------
# Get the original mean and SD of top2_ratio before scaling
top2_mean <- mean(envir_data$top2_ratio, na.rm = TRUE)
top2_sd <- sd(envir_data$top2_ratio, na.rm = TRUE)

# Get predictions on the scaled variable
pred_top2 <- ggpredict(netting_mod4_NB , terms = "top2_ratio")

# Unscale the x-axis
pred_top2$x_unscaled <- (pred_top2$x * top2_sd) + top2_mean

# Plot
(top2_count_net <-ggplot(pred_top2, aes(x = x_unscaled, y = predicted)) +
  geom_line(size = 1.2, color = predictor_colors[["top2_ratio"]]) +
  geom_ribbon(aes(ymin = conf.low, ymax = conf.high), 
              fill = alpha(predictor_colors[["top2_ratio"]], 0.5)) + 
  labs(
    title = "Predicted Interaction Counts vs Ratio of Top 2 \nFlower Species",
    x = "Top 2 Ratio: percentage of D. carota and P. sativa per transect",
    y = "Predicted Count of insects caught \nduring transect walk"
  ))

## Floral_simpson_index_T netting_mod4_NB ---------
# Get the original mean and SD of Floral_simpson_index_T before scaling
floral_mean <- mean(envir_data$Floral_simpson_index_T, na.rm = TRUE)
floral_sd <- sd(envir_data$Floral_simpson_index_T, na.rm = TRUE)

# Get predictions on the scaled variable
pred_floral <- ggpredict(netting_mod4_NB , terms = "Floral_simpson_index_T")

# Unscale the x-axis
pred_floral$x_unscaled <- (pred_floral$x * floral_sd) + floral_mean
# Plot
(flosimp <-ggplot(pred_floral, aes(x = x_unscaled, y = predicted)) +
  geom_line(size = 1.2, color = predictor_colors[["Floral_simpson_index_T"]]) +
  geom_ribbon(aes(ymin = conf.low, ymax = conf.high), 
              fill = alpha(predictor_colors[["Floral_simpson_index_T"]], 0.5)) + 
  labs(
    title = "Transect walk: Predicted Interaction Counts vs Floral Simpson Index",
    x = "Floral Simpson Index",
    y = "Predicted Count of insects caught during transect walk"
  ))

## Plot_Cover_T netting_mod4_NB ---------
# Get the original mean and SD of Plot_Cover_T before scaling
plot_cover_mean <- mean(envir_data$Plot_Cover_T, na.rm = TRUE)
plot_cover_sd <- sd(envir_data$Plot_Cover_T, na.rm = TRUE)

# Get predictions on the scaled variable
pred_plot_cover <- ggpredict(netting_mod4_NB , terms = "Plot_Cover_T")

# Unscale the x-axis
pred_plot_cover$x_unscaled <- (pred_plot_cover$x * plot_cover_sd) + plot_cover_mean
# Plot
(floralcover <-ggplot(pred_plot_cover, aes(x = x_unscaled, y = predicted)) +
  geom_line(size = 1.2, color = predictor_colors[["Plot_Cover_T"]]) +
  geom_ribbon(aes(ymin = conf.low, ymax = conf.high), 
              fill = alpha(predictor_colors[["Plot_Cover_T"]],0.5)) +
  labs(
    title = "Predicted Interaction Counts vs Floral Cover %",
    x = "Floral Cover: average percentage \nof flower cover per transect",
    y = "Predicted Count of insects caught \nduring transect walk"
  ))

#save combined plots for thesis
# est_count_net,  sitetype_count_net, top2_count_net, floralcover
library(cowplot)

# First row
top_row <- cowplot::plot_grid(
  est_count_net, 
  top2_count_net, 
  ncol = 2, 
  labels = c("A", "B"), 
  label_size = 12, 
  rel_widths = c(1, 1)
)

# Second row
bottom_row <- cowplot::plot_grid(
  sitetype_count_net, 
  floralcover, 
  ncol = 2, 
  labels = c("C", "D"), 
  label_size = 12, 
  rel_widths = c(1, 1)
)

# Combine with vertical spacing
p_grid <- cowplot::plot_grid(
  top_row, 
  NULL,  # Spacer
  bottom_row, 
  ncol = 1, 
  rel_heights = c(1, 0.1, 1)  # Increase 0.1 to make bigger space
)

# Add a common title
(final_plot <- ggdraw() +
  draw_label("Transect walk: Interaction counts", fontface = 'bold', x = 0.5, y = 0.98, hjust = 0.5, size = 14) +
  draw_plot(p_grid, y = 0, height = 0.95))

# Save the final plot
ggsave("C:/Users/Almas/Desktop/UNI_LEIPSI/Thesis/Thesis_Rproject/figures/netting_interaction_counts.png", 
       plot = final_plot, width = 10, height = 8, dpi = 600)
#save est_count_net, top2_count_net, sitetype_count_net
#ggsave("C:/Users/Almas/Desktop/UNI_LEIPSI/Thesis/Thesis_Rproject/figures/est_count_net.png", plot = est_count_net, width = 8, height = 6)
#ggsave("C:/Users/Almas/Desktop/UNI_LEIPSI/Thesis/Thesis_Rproject/figures/top2_count_net.png", plot = top2_count_net, width = 8, height = 6)
#ggsave("C:/Users/Almas/Desktop/UNI_LEIPSI/Thesis/Thesis_Rproject/figures/sitetype_count_net.png", plot = sitetype_count_net, width = 8, height = 6)
ggsave("C:/Users/Almas/Desktop/UNI_LEIPSI/Thesis/Thesis_Rproject/figures/flosimp_count_net.png", plot = flosimp, width = 8, height = 6, dpi=600)

rm(top_row, bottom_row, p_grid, final_plot)

IV.B.2.c. Interpretation of results

  • Model Type: netting_mod4_NB, a generalized linear mixed-effects model (GLMM)
    • Family: Negative Binomial (glmer.nb), log link
    • Random Effect: site (random intercept)
  • Model Fit and Assumptions:
    • No signs of overdispersion (dispersion ratio = 1.51, p = 0.184)
    • Residuals were well-behaved in simulation checks (DHARMa)
    • Multicollinearity was low (VIFs all < 2.2)
    • Random intercept variance was effectively zero (SD = 8.46e-05, ICC ≈ 7e-08), suggesting almost no variability between sites after accounting for fixed effects
    • The fit is singular (isSingular() returns TRUE) because the site variance is estimated at the boundary, so the random intercept contributes essentially nothing to the fit
  • Response Variable:
    • total_interaction_T: total number of observed and collected pollinator interactions
  • Fixed Effects Included:
    • Floral_simpson_index_T: floral diversity (scaled)
    • top2_ratio: dominance of the two most abundant floral species (scaled)
    • Site_type: young restored vs. reference sites
    • dm_wind_velocity: wind speed (scaled)
    • Plot_Cover_T: flower cover per transect (scaled)
  • Key Results (estimates on the log scale):
    • Floral Simpson Diversity: Significant negative effect
      • Estimate = -0.32, p < 0.001 → Higher diversity = fewer total interactions
    • Top2 Dominance: Significant positive effect
      • Estimate = 0.24, p < 0.001 → Higher dominance = more interactions
    • Site Type: Young restored sites had fewer interactions than reference
      • Estimate = -0.36, p < 0.001
    • Wind Speed: Significant negative effect on interaction count
      • Estimate = -0.17, p = 0.001
    • Flower Cover: Also showed a negative effect on interaction count
      • Estimate = -0.20, p = 0.008
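
Because the model uses a log link, the estimates are easier to read after exponentiation, which turns them into incidence rate ratios (multiplicative changes in the expected count). The sketch below does this for the rounded estimates and Wald confidence limits from the parameters(netting_mod4_NB) output printed earlier. For the scaled predictors, the ratio refers to an increase of one standard deviation.

```r
# Rounded log-scale estimates and 95% CI limits from parameters(netting_mod4_NB)
est <- c(Floral_simpson_index_T  = -0.32,
         top2_ratio              =  0.24,
         Site_type_young_restored = -0.36,
         dm_wind_velocity        = -0.17,
         Plot_Cover_T            = -0.20)
ci_low  <- c(-0.47, 0.14, -0.57, -0.27, -0.34)
ci_high <- c(-0.17, 0.34, -0.15, -0.06, -0.05)

irr <- data.frame(
  term       = names(est),
  IRR        = exp(est),               # multiplicative effect on the expected count
  IRR_low    = exp(ci_low),
  IRR_high   = exp(ci_high),
  pct_change = 100 * (exp(est) - 1),   # same effect as a percentage change
  row.names  = NULL
)
irr_print <- irr
irr_print[-1] <- round(irr_print[-1], 2)
print(irr_print)
```

Read this way, the site-type estimate of -0.36 corresponds to roughly 30% fewer interactions in young restored sites than in reference sites, holding the other predictors constant.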
#removal of unnecessary objects - all starting with netting_mod or plantsimp_
rm(list = ls(pattern = "^netting_mod"))
rm(list = ls(pattern = "^plantsimp_"))

IV.B.3. NETTING Species richness - Poisson

Not all insects are identified to species level, only some groups are, so this step uses the lowest taxon available for each specimen.

# Create a new data frame with the number of unique lowest-level taxa per site and transect
netting_richness <- netting %>%
  group_by(site, transect) %>%
  summarise(unique_taxa = n_distinct(lowest_taxa), .groups = "drop")%>%
  #join the scaled_envir_data 
  left_join(scaled_envir_data, by = c("site"="Site", "transect"="Transect"))
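
For readers less familiar with dplyr, the richness count above amounts to counting distinct labels within each site and transect combination. A minimal base R sketch on a toy table follows. The site codes, transect numbers and taxa are invented for illustration and are not the thesis data.

```r
# Toy catch table: one row per netted insect (hypothetical values)
toy <- data.frame(
  site        = c("A", "A", "A", "A", "B", "B", "B"),
  transect    = c(1, 1, 1, 2, 1, 1, 1),
  lowest_taxa = c("Apis mellifera", "Apis mellifera", "Syrphidae",
                  "Bombus", "Syrphidae", "Syrphidae", "Syrphidae")
)

# Count distinct labels per site x transect
richness <- aggregate(lowest_taxa ~ site + transect, data = toy,
                      FUN = function(x) length(unique(x)))
names(richness)[3] <- "unique_taxa"
richness <- richness[order(richness$site, richness$transect), ]
print(richness)
```

Labels at different taxonomic ranks (a species name next to a family name) are counted as separate taxa here, so the resulting number is a richness of identification labels rather than a strict species richness.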

#histogram of unique taxa, binwidth = 1
netting_richness %>%
  ggplot(aes(x = unique_taxa)) +
  geom_histogram(binwidth = 1, fill = "lightblue", color = "black") +
  labs(title = "Histogram of Unique Taxa",
       x = "Unique Taxa",
       y = "Count")

#is the richness of taxa normally distributed?
shapiro.test(netting_richness$unique_taxa) # p-value = 0.2197: normality is not rejected (this does not prove the data are normal)
## 
##  Shapiro-Wilk normality test
## 
## data:  netting_richness$unique_taxa
## W = 0.96674, p-value = 0.2197
# full model with taxon richness (unique lowest-level taxa) as response variable and environmental, weather and plant diversity variables as explanatory variables, and site as random effect
# Poisson distribution, because the response is a count

netting_rich_mod1_poiss <- glmer(unique_taxa 
                                 ~Floral_simpson_index_T 
                                 + minutes_since_9am
                                 + top2_ratio
                                 + Site_type
                                 + dm_wind_velocity
                                 + dm_temperature
                                 + Days_since_start
                                 + Plot_Cover_T
                                 + (1|site), 
                                 data = netting_richness, family = "poisson")
## boundary (singular) fit: see help('isSingular')
summary(netting_rich_mod1_poiss)
## Generalized linear mixed model fit by maximum likelihood (Laplace
##   Approximation) [glmerMod]
##  Family: poisson  ( log )
## Formula: 
## unique_taxa ~ Floral_simpson_index_T + minutes_since_9am + top2_ratio +  
##     Site_type + dm_wind_velocity + dm_temperature + Days_since_start +  
##     Plot_Cover_T + (1 | site)
##    Data: netting_richness
## 
##      AIC      BIC   logLik deviance df.resid 
##    247.1    265.2   -113.6    227.1       35 
## 
## Scaled residuals: 
##      Min       1Q   Median       3Q      Max 
## -1.47797 -0.60775 -0.05869  0.61845  1.59765 
## 
## Random effects:
##  Groups Name        Variance Std.Dev.
##  site   (Intercept) 0        0       
## Number of obs: 45, groups:  site, 9
## 
## Fixed effects:
##                          Estimate Std. Error z value Pr(>|z|)    
## (Intercept)              2.611513   0.066489  39.277  < 2e-16 ***
## Floral_simpson_index_T  -0.090140   0.060380  -1.493   0.1355    
## minutes_since_9am       -0.089149   0.046420  -1.920   0.0548 .  
## top2_ratio               0.058592   0.042452   1.380   0.1675    
## Site_typeyoung_restored  0.015021   0.118110   0.127   0.8988    
## dm_wind_velocity         0.002427   0.066070   0.037   0.9707    
## dm_temperature          -0.007824   0.071469  -0.109   0.9128    
## Days_since_start        -0.218909   0.053857  -4.065 4.81e-05 ***
## Plot_Cover_T            -0.047777   0.065689  -0.727   0.4670    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Correlation of Fixed Effects:
##             (Intr) Fl___T mnt__9 tp2_rt St_ty_ dm_wn_ dm_tmp Dys_s_
## Flrl_smp__T -0.126                                                 
## mnts_snc_9m  0.268  0.158                                          
## top2_ratio  -0.093  0.079 -0.161                                   
## St_typyng_r -0.790  0.208 -0.279  0.072                            
## dm_wnd_vlct  0.329  0.078  0.199 -0.102 -0.414                     
## dm_tempertr  0.529 -0.016  0.247  0.016 -0.664  0.685              
## Dys_snc_str  0.202  0.063  0.293  0.012 -0.121 -0.341 -0.050       
## Plot_Covr_T -0.369  0.606  0.049 -0.144  0.490 -0.263 -0.477  0.045
## optimizer (Nelder_Mead) convergence code: 0 (OK)
## boundary (singular) fit: see help('isSingular')
parameters(netting_rich_mod1_poiss)
## Your model may suffer from singularity (see see `?lme4::isSingular` and
##   `?performance::check_singularity`).
##   Some of the standard errors and confidence intervals of the random
##   effects parameters are probably not meaningful!
##   You may try to impose a prior on the random effects parameters, e.g.
##   using the glmmTMB package.
## # Fixed Effects
## 
## Parameter                  |  Log-Mean |   SE |         95% CI |     z |      p
## -------------------------------------------------------------------------------
## (Intercept)                |      2.61 | 0.07 | [ 2.48,  2.74] | 39.28 | < .001
## Floral simpson index T     |     -0.09 | 0.06 | [-0.21,  0.03] | -1.49 | 0.135 
## minutes since 9am          |     -0.09 | 0.05 | [-0.18,  0.00] | -1.92 | 0.055 
## top2 ratio                 |      0.06 | 0.04 | [-0.02,  0.14] |  1.38 | 0.168 
## Site type [young_restored] |      0.02 | 0.12 | [-0.22,  0.25] |  0.13 | 0.899 
## dm wind velocity           |  2.43e-03 | 0.07 | [-0.13,  0.13] |  0.04 | 0.971 
## dm temperature             | -7.82e-03 | 0.07 | [-0.15,  0.13] | -0.11 | 0.913 
## Days since start           |     -0.22 | 0.05 | [-0.32, -0.11] | -4.06 | < .001
## Plot Cover T               |     -0.05 | 0.07 | [-0.18,  0.08] | -0.73 | 0.467 
## 
## # Random Effects
## 
## Parameter            | Coefficient |   SE | CI_low
## --------------------------------------------------
## SD (Intercept: site) |        0.00 | 0.06 |   0.00
## 
## Uncertainty intervals (equal-tailed) and p-values (two-tailed) computed
##   using a Wald z-distribution approximation.
#check for singularity
performance::check_singularity(netting_rich_mod1_poiss)
## [1] TRUE
#check the model
check_model(netting_rich_mod1_poiss, verbose = TRUE)
## Some of the variables were in matrix-format - probably you used
##   `scale()` on your data?
##   If so, and you get an error, please try `datawizard::standardize()` to
##   standardize your data.

#overdispersion
check_overdispersion(netting_rich_mod1_poiss)
## # Overdispersion test
## 
##        dispersion ratio =  0.755
##   Pearson's Chi-Squared = 26.427
##                 p-value =  0.851
## No overdispersion detected.
#collinearity
check_collinearity(netting_rich_mod1_poiss)
## # Check for Multicollinearity
## 
## Low Correlation
## 
##                    Term  VIF   VIF 95% CI Increased SE Tolerance
##  Floral_simpson_index_T 2.09 [1.57, 3.07]         1.44      0.48
##       minutes_since_9am 1.33 [1.11, 2.00]         1.15      0.75
##              top2_ratio 1.17 [1.03, 1.96]         1.08      0.85
##               Site_type 2.15 [1.61, 3.16]         1.46      0.47
##        dm_wind_velocity 2.50 [1.83, 3.69]         1.58      0.40
##          dm_temperature 3.27 [2.33, 4.89]         1.81      0.31
##        Days_since_start 1.46 [1.19, 2.16]         1.21      0.68
##            Plot_Cover_T 2.73 [1.98, 4.04]         1.65      0.37
##  Tolerance 95% CI
##      [0.33, 0.64]
##      [0.50, 0.90]
##      [0.51, 0.97]
##      [0.32, 0.62]
##      [0.27, 0.55]
##      [0.20, 0.43]
##      [0.46, 0.84]
##      [0.25, 0.51]
# DHARMa package - simulate residuals and check model assumptions
netting_rich_mod1_poiss_sim_res <- simulateResiduals(fittedModel = netting_rich_mod1_poiss)
plot(netting_rich_mod1_poiss_sim_res)
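
The dispersion ratio printed by check_overdispersion() can be reproduced by hand from the reported values: it is the Pearson chi-squared statistic divided by the residual degrees of freedom. The sketch below uses only the numbers shown for netting_rich_mod1_poiss; the variable names are illustrative, and the upper-tail chi-squared p-value is an assumption about how the reported test is computed.

```r
# Values copied from the check_overdispersion() output for netting_rich_mod1_poiss
pearson_chisq <- 26.427
df_resid <- 35  # 45 observations minus 10 estimated parameters (9 fixed, 1 variance)

# Dispersion ratio: values near 1 are consistent with the Poisson assumption
dispersion_ratio <- pearson_chisq / df_resid

# Upper-tail chi-squared p-value (assumed to match the reported test)
p_value <- pchisq(pearson_chisq, df = df_resid, lower.tail = FALSE)

round(c(ratio = dispersion_ratio, p = p_value), 3)
```

A ratio below 1, as here, points toward mild underdispersion rather than overdispersion, so a negative binomial alternative is unlikely to be needed on these grounds.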

# reduced model: family richness as response variable; environmental, weather, and plant diversity variables as explanatory variables; site as random effect
# removing dm_temperature (p = 0.913 in netting_rich_mod1_poiss)

netting_rich_mod2_poiss <- glmer(unique_taxa 
                                 ~Floral_simpson_index_T 
                                 + minutes_since_9am
                                 + top2_ratio
                                 + Site_type
                                 + dm_wind_velocity
                                 #+ dm_temperature
                                 + Days_since_start
                                 + Plot_Cover_T
                                 + (1|site), 
                                 data = netting_richness, family = "poisson")
## boundary (singular) fit: see help('isSingular')
summary(netting_rich_mod2_poiss)
## Generalized linear mixed model fit by maximum likelihood (Laplace
##   Approximation) [glmerMod]
##  Family: poisson  ( log )
## Formula: 
## unique_taxa ~ Floral_simpson_index_T + minutes_since_9am + top2_ratio +  
##     Site_type + dm_wind_velocity + Days_since_start + Plot_Cover_T +  
##     (1 | site)
##    Data: netting_richness
## 
##      AIC      BIC   logLik deviance df.resid 
##    245.2    261.4   -113.6    227.2       36 
## 
## Scaled residuals: 
##      Min       1Q   Median       3Q      Max 
## -1.46546 -0.61201 -0.05741  0.62537  1.64075 
## 
## Random effects:
##  Groups Name        Variance Std.Dev.
##  site   (Intercept) 0        0       
## Number of obs: 45, groups:  site, 9
## 
## Fixed effects:
##                          Estimate Std. Error z value Pr(>|z|)    
## (Intercept)              2.615347   0.056437  46.341  < 2e-16 ***
## Floral_simpson_index_T  -0.090239   0.060354  -1.495   0.1349    
## minutes_since_9am       -0.087894   0.044989  -1.954   0.0507 .  
## top2_ratio               0.058670   0.042461   1.382   0.1671    
## Site_typeyoung_restored  0.006449   0.088341   0.073   0.9418    
## dm_wind_velocity         0.007378   0.048230   0.153   0.8784    
## Days_since_start        -0.219203   0.053805  -4.074 4.62e-05 ***
## Plot_Cover_T            -0.051207   0.057763  -0.887   0.3753    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Correlation of Fixed Effects:
##             (Intr) Fl___T mnt__9 tp2_rt St_ty_ dm_wn_ Dys_s_
## Flrl_smp__T -0.138                                          
## mnts_snc_9m  0.167  0.167                                   
## top2_ratio  -0.119  0.079 -0.171                            
## St_typyng_r -0.692  0.264 -0.159  0.110                     
## dm_wnd_vlct -0.052  0.122  0.042 -0.155  0.072              
## Dys_snc_str  0.269  0.062  0.315  0.014 -0.206 -0.422       
## Plot_Covr_T -0.156  0.681  0.196 -0.156  0.263  0.098  0.024
## optimizer (Nelder_Mead) convergence code: 0 (OK)
## boundary (singular) fit: see help('isSingular')
parameters(netting_rich_mod2_poiss)
## Your model may suffer from singularity (see see `?lme4::isSingular` and
##   `?performance::check_singularity`).
##   Some of the standard errors and confidence intervals of the random
##   effects parameters are probably not meaningful!
##   You may try to impose a prior on the random effects parameters, e.g.
##   using the glmmTMB package.
## # Fixed Effects
## 
## Parameter                  | Log-Mean |   SE |         95% CI |     z |      p
## ------------------------------------------------------------------------------
## (Intercept)                |     2.62 | 0.06 | [ 2.50,  2.73] | 46.34 | < .001
## Floral simpson index T     |    -0.09 | 0.06 | [-0.21,  0.03] | -1.50 | 0.135 
## minutes since 9am          |    -0.09 | 0.04 | [-0.18,  0.00] | -1.95 | 0.051 
## top2 ratio                 |     0.06 | 0.04 | [-0.02,  0.14] |  1.38 | 0.167 
## Site type [young_restored] | 6.45e-03 | 0.09 | [-0.17,  0.18] |  0.07 | 0.942 
## dm wind velocity           | 7.38e-03 | 0.05 | [-0.09,  0.10] |  0.15 | 0.878 
## Days since start           |    -0.22 | 0.05 | [-0.32, -0.11] | -4.07 | < .001
## Plot Cover T               |    -0.05 | 0.06 | [-0.16,  0.06] | -0.89 | 0.375 
## 
## # Random Effects
## 
## Parameter            | Coefficient |   SE | CI_low
## --------------------------------------------------
## SD (Intercept: site) |        0.00 | 0.06 |   0.00
## 
## Uncertainty intervals (equal-tailed) and p-values (two-tailed) computed
##   using a Wald z-distribution approximation.
#check the model
check_model(netting_rich_mod2_poiss, verbose = TRUE)
## Some of the variables were in matrix-format - probably you used
##   `scale()` on your data?
##   If so, and you get an error, please try `datawizard::standardize()` to
##   standardize your data.

#overdispersion
check_overdispersion(netting_rich_mod2_poiss)
## # Overdispersion test
## 
##        dispersion ratio =  0.735
##   Pearson's Chi-Squared = 26.457
##                 p-value =  0.877
## No overdispersion detected.
#collinearity
check_collinearity(netting_rich_mod2_poiss)
## # Check for Multicollinearity
## 
## Low Correlation
## 
##                    Term  VIF   VIF 95% CI Increased SE Tolerance
##  Floral_simpson_index_T 2.08 [1.56, 3.10]         1.44      0.48
##       minutes_since_9am 1.25 [1.06, 1.97]         1.12      0.80
##              top2_ratio 1.17 [1.03, 2.01]         1.08      0.85
##               Site_type 1.20 [1.04, 1.98]         1.10      0.83
##        dm_wind_velocity 1.33 [1.10, 2.03]         1.15      0.75
##        Days_since_start 1.46 [1.18, 2.18]         1.21      0.68
##            Plot_Cover_T 2.11 [1.57, 3.13]         1.45      0.47
##  Tolerance 95% CI
##      [0.32, 0.64]
##      [0.51, 0.94]
##      [0.50, 0.97]
##      [0.51, 0.96]
##      [0.49, 0.91]
##      [0.46, 0.85]
##      [0.32, 0.64]
# DHARMa package - simulate residuals and check model assumptions
netting_rich_mod2_poiss_sim_res <- simulateResiduals(fittedModel = netting_rich_mod2_poiss)
plot(netting_rich_mod2_poiss_sim_res)
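
The extra columns in the check_collinearity() table follow directly from the VIF: "Increased SE" is its square root and "Tolerance" is its reciprocal. A minimal sketch using the VIFs reported for netting_rich_mod2_poiss (the object name collin is illustrative; small differences from the printed table come from the VIFs being rounded to two decimals):

```r
# VIFs copied from the check_collinearity() output for netting_rich_mod2_poiss
vif <- c(Floral_simpson_index_T = 2.08, minutes_since_9am = 1.25,
         top2_ratio = 1.17, Site_type = 1.20, dm_wind_velocity = 1.33,
         Days_since_start = 1.46, Plot_Cover_T = 2.11)

collin <- data.frame(
  VIF = vif,
  increased_se = sqrt(vif),  # factor by which the standard error is inflated
  tolerance = 1 / vif        # share of a predictor's variance not shared with the others
)
round(collin, 2)
```

All VIFs are well below 5, which matches the "Low Correlation" label in the output.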

# removing Site_type (p = 0.942 in netting_rich_mod2_poiss)

netting_rich_mod3_poiss <- glmer(unique_taxa 
                                 ~Floral_simpson_index_T 
                                 + minutes_since_9am
                                 + top2_ratio
                                 #+ Site_type
                                 + dm_wind_velocity
                                 #+ dm_temperature
                                 + Days_since_start
                                 + Plot_Cover_T
                                 + (1|site), 
                                 data = netting_richness, family = "poisson")
## boundary (singular) fit: see help('isSingular')
summary(netting_rich_mod3_poiss)
## Generalized linear mixed model fit by maximum likelihood (Laplace
##   Approximation) [glmerMod]
##  Family: poisson  ( log )
## Formula: 
## unique_taxa ~ Floral_simpson_index_T + minutes_since_9am + top2_ratio +  
##     dm_wind_velocity + Days_since_start + Plot_Cover_T + (1 |      site)
##    Data: netting_richness
## 
##      AIC      BIC   logLik deviance df.resid 
##    243.2    257.6   -113.6    227.2       37 
## 
## Scaled residuals: 
##     Min      1Q  Median      3Q     Max 
## -1.4746 -0.6098 -0.0616  0.6332  1.6327 
## 
## Random effects:
##  Groups Name        Variance Std.Dev.
##  site   (Intercept) 0        0       
## Number of obs: 45, groups:  site, 9
## 
## Fixed effects:
##                         Estimate Std. Error z value Pr(>|z|)    
## (Intercept)             2.618196   0.040724  64.292  < 2e-16 ***
## Floral_simpson_index_T -0.091401   0.058240  -1.569   0.1166    
## minutes_since_9am      -0.087371   0.044427  -1.967   0.0492 *  
## top2_ratio              0.058328   0.042202   1.382   0.1669    
## dm_wind_velocity        0.007126   0.048068   0.148   0.8821    
## Days_since_start       -0.218394   0.052659  -4.147 3.36e-05 ***
## Plot_Cover_T           -0.052313   0.055741  -0.939   0.3480    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Correlation of Fixed Effects:
##             (Intr) Fl___T mnt__9 tp2_rt dm_wn_ Dys_s_
## Flrl_smp__T  0.064                                   
## mnts_snc_9m  0.080  0.220                            
## top2_ratio  -0.061  0.052 -0.156                     
## dm_wnd_vlct -0.003  0.107  0.055 -0.164              
## Dys_snc_str  0.179  0.124  0.292  0.037 -0.417       
## Plot_Covr_T  0.038  0.657  0.251 -0.193  0.083  0.083
## optimizer (Nelder_Mead) convergence code: 0 (OK)
## boundary (singular) fit: see help('isSingular')
parameters(netting_rich_mod3_poiss)
## Your model may suffer from singularity (see see `?lme4::isSingular` and
##   `?performance::check_singularity`).
##   Some of the standard errors and confidence intervals of the random
##   effects parameters are probably not meaningful!
##   You may try to impose a prior on the random effects parameters, e.g.
##   using the glmmTMB package.
## # Fixed Effects
## 
## Parameter              | Log-Mean |   SE |         95% CI |     z |      p
## --------------------------------------------------------------------------
## (Intercept)            |     2.62 | 0.04 | [ 2.54,  2.70] | 64.29 | < .001
## Floral simpson index T |    -0.09 | 0.06 | [-0.21,  0.02] | -1.57 | 0.117 
## minutes since 9am      |    -0.09 | 0.04 | [-0.17,  0.00] | -1.97 | 0.049 
## top2 ratio             |     0.06 | 0.04 | [-0.02,  0.14] |  1.38 | 0.167 
## dm wind velocity       | 7.13e-03 | 0.05 | [-0.09,  0.10] |  0.15 | 0.882 
## Days since start       |    -0.22 | 0.05 | [-0.32, -0.12] | -4.15 | < .001
## Plot Cover T           |    -0.05 | 0.06 | [-0.16,  0.06] | -0.94 | 0.348 
## 
## # Random Effects
## 
## Parameter            | Coefficient |   SE | CI_low
## --------------------------------------------------
## SD (Intercept: site) |        0.00 | 0.06 |   0.00
## 
## Uncertainty intervals (equal-tailed) and p-values (two-tailed) computed
##   using a Wald z-distribution approximation.
#check the model
check_model(netting_rich_mod3_poiss, verbose = TRUE)
## Some of the variables were in matrix-format - probably you used
##   `scale()` on your data?
##   If so, and you get an error, please try `datawizard::standardize()` to
##   standardize your data.

#overdispersion
check_overdispersion(netting_rich_mod3_poiss)
## # Overdispersion test
## 
##        dispersion ratio =  0.715
##   Pearson's Chi-Squared = 26.453
##                 p-value =  0.901
## No overdispersion detected.
#collinearity
check_collinearity(netting_rich_mod3_poiss)
## # Check for Multicollinearity
## 
## Low Correlation
## 
##                    Term  VIF   VIF 95% CI Increased SE Tolerance
##  Floral_simpson_index_T 1.94 [1.46, 2.92]         1.39      0.52
##       minutes_since_9am 1.22 [1.05, 2.01]         1.10      0.82
##              top2_ratio 1.16 [1.02, 2.10]         1.08      0.86
##        dm_wind_velocity 1.32 [1.10, 2.05]         1.15      0.76
##        Days_since_start 1.40 [1.14, 2.14]         1.18      0.72
##            Plot_Cover_T 1.96 [1.47, 2.95]         1.40      0.51
##  Tolerance 95% CI
##      [0.34, 0.68]
##      [0.50, 0.96]
##      [0.48, 0.98]
##      [0.49, 0.91]
##      [0.47, 0.88]
##      [0.34, 0.68]
# DHARMa package - simulate residuals and check model assumptions
netting_rich_mod3_poiss_sim_res <- simulateResiduals(fittedModel = netting_rich_mod3_poiss)
plot(netting_rich_mod3_poiss_sim_res)
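
Each elimination step here retypes the full glmer() call with one more term commented out. An alternative is update(), which refits a model with a modified formula. The sketch below illustrates the idea on simulated data with a plain Poisson glm() as a stand-in; the data frame toy and its columns are hypothetical and unrelated to netting_richness.

```r
# Hypothetical data for illustration only; not the netting_richness data
set.seed(1)
toy <- data.frame(
  x1 = rnorm(45),
  x2 = rnorm(45),
  x3 = rnorm(45)
)
toy$y <- rpois(45, lambda = exp(2.6 - 0.2 * toy$x1))

# Full model, using a plain Poisson GLM as a stand-in for glmer()
toy_mod1 <- glm(y ~ x1 + x2 + x3, data = toy, family = poisson)

# Drop one term without retyping the formula
toy_mod2 <- update(toy_mod1, . ~ . - x3)

formula(toy_mod2)
AIC(toy_mod1, toy_mod2)
```

The same pattern should carry over to the mixed models, for example update(netting_rich_mod1_poiss, . ~ . - dm_temperature), since lme4 provides an update() method for its fitted models.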

# removing dm_wind_velocity (p = 0.882 in netting_rich_mod3_poiss)
netting_rich_mod4_poiss <- glmer(unique_taxa 
                                 ~Floral_simpson_index_T 
                                 + minutes_since_9am
                                 + top2_ratio
                                 #+ Site_type
                                 #+ dm_wind_velocity
                                 #+ dm_temperature
                                 + Days_since_start
                                 + Plot_Cover_T
                                 + (1|site), 
                                 data = netting_richness, family = "poisson")
## boundary (singular) fit: see help('isSingular')
summary(netting_rich_mod4_poiss)
## Generalized linear mixed model fit by maximum likelihood (Laplace
##   Approximation) [glmerMod]
##  Family: poisson  ( log )
## Formula: 
## unique_taxa ~ Floral_simpson_index_T + minutes_since_9am + top2_ratio +  
##     Days_since_start + Plot_Cover_T + (1 | site)
##    Data: netting_richness
## 
##      AIC      BIC   logLik deviance df.resid 
##    241.2    253.8   -113.6    227.2       38 
## 
## Scaled residuals: 
##      Min       1Q   Median       3Q      Max 
## -1.43293 -0.61796 -0.06102  0.67209  1.63117 
## 
## Random effects:
##  Groups Name        Variance Std.Dev.
##  site   (Intercept) 0        0       
## Number of obs: 45, groups:  site, 9
## 
## Fixed effects:
##                        Estimate Std. Error z value Pr(>|z|)    
## (Intercept)             2.61820    0.04072  64.291   <2e-16 ***
## Floral_simpson_index_T -0.09233    0.05790  -1.595   0.1108    
## minutes_since_9am      -0.08773    0.04435  -1.978   0.0479 *  
## top2_ratio              0.05935    0.04162   1.426   0.1539    
## Days_since_start       -0.21515    0.04788  -4.494    7e-06 ***
## Plot_Cover_T           -0.05299    0.05554  -0.954   0.3400    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Correlation of Fixed Effects:
##             (Intr) Fl___T mnt__9 tp2_rt Dys_s_
## Flrl_smp__T  0.064                            
## mnts_snc_9m  0.080  0.217                     
## top2_ratio  -0.062  0.072 -0.150              
## Dys_snc_str  0.195  0.186  0.346 -0.036       
## Plot_Covr_T  0.038  0.654  0.248 -0.182  0.130
## optimizer (Nelder_Mead) convergence code: 0 (OK)
## boundary (singular) fit: see help('isSingular')
parameters(netting_rich_mod4_poiss)
## Your model may suffer from singularity (see see `?lme4::isSingular` and
##   `?performance::check_singularity`).
##   Some of the standard errors and confidence intervals of the random
##   effects parameters are probably not meaningful!
##   You may try to impose a prior on the random effects parameters, e.g.
##   using the glmmTMB package.
## # Fixed Effects
## 
## Parameter              | Log-Mean |   SE |         95% CI |     z |      p
## --------------------------------------------------------------------------
## (Intercept)            |     2.62 | 0.04 | [ 2.54,  2.70] | 64.29 | < .001
## Floral simpson index T |    -0.09 | 0.06 | [-0.21,  0.02] | -1.59 | 0.111 
## minutes since 9am      |    -0.09 | 0.04 | [-0.17,  0.00] | -1.98 | 0.048 
## top2 ratio             |     0.06 | 0.04 | [-0.02,  0.14] |  1.43 | 0.154 
## Days since start       |    -0.22 | 0.05 | [-0.31, -0.12] | -4.49 | < .001
## Plot Cover T           |    -0.05 | 0.06 | [-0.16,  0.06] | -0.95 | 0.340 
## 
## # Random Effects
## 
## Parameter            | Coefficient |   SE | CI_low
## --------------------------------------------------
## SD (Intercept: site) |        0.00 | 0.06 |   0.00
## 
## Uncertainty intervals (equal-tailed) and p-values (two-tailed) computed
##   using a Wald z-distribution approximation.
#check the model
check_model(netting_rich_mod4_poiss, verbose = TRUE)
## Some of the variables were in matrix-format - probably you used
##   `scale()` on your data?
##   If so, and you get an error, please try `datawizard::standardize()` to
##   standardize your data.

#overdispersion
check_overdispersion(netting_rich_mod4_poiss)
## # Overdispersion test
## 
##        dispersion ratio =  0.697
##   Pearson's Chi-Squared = 26.488
##                 p-value =   0.92
## No overdispersion detected.
#collinearity
check_collinearity(netting_rich_mod4_poiss)
## # Check for Multicollinearity
## 
## Low Correlation
## 
##                    Term  VIF   VIF 95% CI Increased SE Tolerance
##  Floral_simpson_index_T 1.92 [1.44, 2.92]         1.38      0.52
##       minutes_since_9am 1.21 [1.04, 2.05]         1.10      0.82
##              top2_ratio 1.13 [1.01, 2.34]         1.06      0.89
##        Days_since_start 1.15 [1.02, 2.17]         1.07      0.87
##            Plot_Cover_T 1.95 [1.46, 2.97]         1.40      0.51
##  Tolerance 95% CI
##      [0.34, 0.69]
##      [0.49, 0.96]
##      [0.43, 0.99]
##      [0.46, 0.98]
##      [0.34, 0.69]
# DHARMa package - simulate residuals and check model assumptions
netting_rich_mod4_poiss_sim_res <- simulateResiduals(fittedModel = netting_rich_mod4_poiss)
plot(netting_rich_mod4_poiss_sim_res)

# removing Plot_Cover_T (p = 0.340 in netting_rich_mod4_poiss)
netting_rich_mod5_poiss <- glmer(unique_taxa 
                                 ~Floral_simpson_index_T 
                                 + minutes_since_9am
                                 + top2_ratio
                                 #+ Site_type
                                 #+ dm_wind_velocity
                                 #+ dm_temperature
                                 + Days_since_start
                                 + (1|site), 
                                 data = netting_richness, family = "poisson")
## boundary (singular) fit: see help('isSingular')
summary(netting_rich_mod5_poiss)
## Generalized linear mixed model fit by maximum likelihood (Laplace
##   Approximation) [glmerMod]
##  Family: poisson  ( log )
## Formula: 
## unique_taxa ~ Floral_simpson_index_T + minutes_since_9am + top2_ratio +  
##     Days_since_start + (1 | site)
##    Data: netting_richness
## 
##      AIC      BIC   logLik deviance df.resid 
##    240.1    250.9   -114.1    228.1       39 
## 
## Scaled residuals: 
##      Min       1Q   Median       3Q      Max 
## -1.73418 -0.65597  0.05139  0.68637  1.30260 
## 
## Random effects:
##  Groups Name        Variance Std.Dev.
##  site   (Intercept) 0        0       
## Number of obs: 45, groups:  site, 9
## 
## Fixed effects:
##                        Estimate Std. Error z value Pr(>|z|)    
## (Intercept)             2.61902    0.04069  64.367  < 2e-16 ***
## Floral_simpson_index_T -0.05583    0.04330  -1.290   0.1972    
## minutes_since_9am      -0.07743    0.04311  -1.796   0.0724 .  
## top2_ratio              0.05182    0.04074   1.272   0.2033    
## Days_since_start       -0.20896    0.04743  -4.406 1.05e-05 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Correlation of Fixed Effects:
##             (Intr) Fl___T mnt__9 tp2_rt
## Flrl_smp__T  0.051                     
## mnts_snc_9m  0.073  0.067              
## top2_ratio  -0.055  0.266 -0.118       
## Dys_snc_str  0.192  0.135  0.324 -0.010
## optimizer (Nelder_Mead) convergence code: 0 (OK)
## boundary (singular) fit: see help('isSingular')
parameters(netting_rich_mod5_poiss)
## Your model may suffer from singularity (see see `?lme4::isSingular` and
##   `?performance::check_singularity`).
##   Some of the standard errors and confidence intervals of the random
##   effects parameters are probably not meaningful!
##   You may try to impose a prior on the random effects parameters, e.g.
##   using the glmmTMB package.
## # Fixed Effects
## 
## Parameter              | Log-Mean |   SE |         95% CI |     z |      p
## --------------------------------------------------------------------------
## (Intercept)            |     2.62 | 0.04 | [ 2.54,  2.70] | 64.37 | < .001
## Floral simpson index T |    -0.06 | 0.04 | [-0.14,  0.03] | -1.29 | 0.197 
## minutes since 9am      |    -0.08 | 0.04 | [-0.16,  0.01] | -1.80 | 0.072 
## top2 ratio             |     0.05 | 0.04 | [-0.03,  0.13] |  1.27 | 0.203 
## Days since start       |    -0.21 | 0.05 | [-0.30, -0.12] | -4.41 | < .001
## 
## # Random Effects
## 
## Parameter            | Coefficient |   SE | CI_low
## --------------------------------------------------
## SD (Intercept: site) |        0.00 | 0.07 |   0.00
## 
## Uncertainty intervals (equal-tailed) and p-values (two-tailed) computed
##   using a Wald z-distribution approximation.
#check the model
check_model(netting_rich_mod5_poiss, verbose = TRUE)
## Some of the variables were in matrix-format - probably you used
##   `scale()` on your data?
##   If so, and you get an error, please try `datawizard::standardize()` to
##   standardize your data.

#overdispersion
check_overdispersion(netting_rich_mod5_poiss)
## # Overdispersion test
## 
##        dispersion ratio =  0.693
##   Pearson's Chi-Squared = 27.042
##                 p-value =  0.926
## No overdispersion detected.
#collinearity
check_collinearity(netting_rich_mod5_poiss)
## # Check for Multicollinearity
## 
## Low Correlation
## 
##                    Term  VIF   VIF 95% CI Increased SE Tolerance
##  Floral_simpson_index_T 1.10 [1.01, 2.83]         1.05      0.91
##       minutes_since_9am 1.14 [1.01, 2.33]         1.07      0.88
##              top2_ratio 1.10 [1.00, 2.94]         1.05      0.91
##        Days_since_start 1.13 [1.01, 2.37]         1.06      0.88
##  Tolerance 95% CI
##      [0.35, 0.99]
##      [0.43, 0.99]
##      [0.34, 1.00]
##      [0.42, 0.99]
# DHARMa package - simulate residuals and check model assumptions
netting_rich_mod5_poiss_sim_res <- simulateResiduals(fittedModel = netting_rich_mod5_poiss)
plot(netting_rich_mod5_poiss_sim_res)

# removing top2_ratio (p = 0.203 in netting_rich_mod5_poiss)
netting_rich_mod6_poiss <- glmer(unique_taxa 
                                 ~Floral_simpson_index_T 
                                 + minutes_since_9am
                                 #+ top2_ratio
                                 #+ Site_type
                                 #+ dm_wind_velocity
                                 #+ dm_temperature
                                 + Days_since_start
                                 + (1|site), 
                                 data = netting_richness, family = "poisson")
## boundary (singular) fit: see help('isSingular')
summary(netting_rich_mod6_poiss)
## Generalized linear mixed model fit by maximum likelihood (Laplace
##   Approximation) [glmerMod]
##  Family: poisson  ( log )
## Formula: 
## unique_taxa ~ Floral_simpson_index_T + minutes_since_9am + Days_since_start +  
##     (1 | site)
##    Data: netting_richness
## 
##      AIC      BIC   logLik deviance df.resid 
##    239.7    248.7   -114.9    229.7       40 
## 
## Scaled residuals: 
##      Min       1Q   Median       3Q      Max 
## -1.47165 -0.70151  0.04811  0.66932  1.52514 
## 
## Random effects:
##  Groups Name        Variance Std.Dev.
##  site   (Intercept) 0        0       
## Number of obs: 45, groups:  site, 9
## 
## Fixed effects:
##                        Estimate Std. Error z value Pr(>|z|)    
## (Intercept)             2.62062    0.04061  64.532  < 2e-16 ***
## Floral_simpson_index_T -0.06972    0.04151  -1.680   0.0930 .  
## minutes_since_9am      -0.07113    0.04265  -1.668   0.0954 .  
## Days_since_start       -0.20838    0.04685  -4.448 8.68e-06 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Correlation of Fixed Effects:
##             (Intr) Fl___T mnt__9
## Flrl_smp__T 0.068               
## mnts_snc_9m 0.067  0.110        
## Dys_snc_str 0.189  0.143  0.329 
## optimizer (Nelder_Mead) convergence code: 0 (OK)
## boundary (singular) fit: see help('isSingular')
parameters(netting_rich_mod6_poiss)
## Your model may suffer from singularity (see see `?lme4::isSingular` and
##   `?performance::check_singularity`).
##   Some of the standard errors and confidence intervals of the random
##   effects parameters are probably not meaningful!
##   You may try to impose a prior on the random effects parameters, e.g.
##   using the glmmTMB package.
## # Fixed Effects
## 
## Parameter              | Log-Mean |   SE |         95% CI |     z |      p
## --------------------------------------------------------------------------
## (Intercept)            |     2.62 | 0.04 | [ 2.54,  2.70] | 64.53 | < .001
## Floral simpson index T |    -0.07 | 0.04 | [-0.15,  0.01] | -1.68 | 0.093 
## minutes since 9am      |    -0.07 | 0.04 | [-0.15,  0.01] | -1.67 | 0.095 
## Days since start       |    -0.21 | 0.05 | [-0.30, -0.12] | -4.45 | < .001
## 
## # Random Effects
## 
## Parameter            | Coefficient |   SE | CI_low
## --------------------------------------------------
## SD (Intercept: site) |        0.00 | 0.11 |   0.00
## 
## Uncertainty intervals (equal-tailed) and p-values (two-tailed) computed
##   using a Wald z-distribution approximation.
#check the model
check_model(netting_rich_mod6_poiss, verbose = TRUE)
## Some of the variables were in matrix-format - probably you used
##   `scale()` on your data?
##   If so, and you get an error, please try `datawizard::standardize()` to
##   standardize your data.

#overdispersion
check_overdispersion(netting_rich_mod6_poiss)
## # Overdispersion test
## 
##        dispersion ratio =  0.714
##   Pearson's Chi-Squared = 28.574
##                 p-value =  0.911
## No overdispersion detected.
#collinearity
check_collinearity(netting_rich_mod6_poiss)
## # Check for Multicollinearity
## 
## Low Correlation
## 
##                    Term  VIF     VIF 95% CI Increased SE Tolerance
##  Floral_simpson_index_T 1.03 [1.00, 785.29]         1.01      0.98
##       minutes_since_9am 1.13 [1.01,   2.53]         1.06      0.89
##        Days_since_start 1.14 [1.01,   2.43]         1.07      0.88
##  Tolerance 95% CI
##      [0.00, 1.00]
##      [0.40, 0.99]
##      [0.41, 0.99]
# DHARMa package - simulate residuals and check model assumptions
netting_rich_mod6_poiss_sim_res <- simulateResiduals(fittedModel = netting_rich_mod6_poiss)
plot(netting_rich_mod6_poiss_sim_res)
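
Because the model uses a log link, the coefficients are on the log scale. The sketch below recomputes the Wald z statistics, p-values, and 95% intervals from the estimates and standard errors reported for netting_rich_mod6_poiss, then exponentiates them into rate ratios. It uses only the printed values; the object names are illustrative. Reading the rate ratios "per 1 SD" assumes the predictors were standardized with scale(), as the check_model() messages suggest.

```r
# Estimates and standard errors copied from summary(netting_rich_mod6_poiss)
est <- c(Intercept = 2.62062, Floral_simpson_index_T = -0.06972,
         minutes_since_9am = -0.07113, Days_since_start = -0.20838)
se <- c(0.04061, 0.04151, 0.04265, 0.04685)

# Wald z statistic and two-tailed p-value
z <- est / se
p <- 2 * pnorm(-abs(z))

# 95% Wald interval on the log scale
ci_low <- est - qnorm(0.975) * se
ci_high <- est + qnorm(0.975) * se

# Exponentiate to obtain multiplicative effects on expected richness
wald <- data.frame(
  z = round(z, 2),
  p = signif(p, 3),
  rate_ratio = round(exp(est), 3),
  rr_low = round(exp(ci_low), 3),
  rr_high = round(exp(ci_high), 3)
)
wald
```

Under that assumption, the Days_since_start rate ratio of roughly 0.81 means expected richness falls by about 19% for each 1 SD increase in days since the start of sampling, while the intervals for the other two predictors include 1.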

IV.B.3.a. Compare the models with the performance package

# Compare the models with the performance package
netting_rich_pois_comp1 <- compare_performance(
  netting_rich_mod1_poiss, netting_rich_mod2_poiss, netting_rich_mod3_poiss,
  netting_rich_mod4_poiss, netting_rich_mod5_poiss, netting_rich_mod6_poiss,
  metrics = c("AICc", "BIC", "R2", "ICC", "RMSE")
)

# Print the comparison table
print(netting_rich_pois_comp1)
## # Comparison of Model Performance Indices
## 
## Name                    |    Model | AICc (weights) | BIC (weights) |  RMSE
## ---------------------------------------------------------------------------
## netting_rich_mod1_poiss | glmerMod |  253.6 (0.001) | 265.2 (<.001) | 2.817
## netting_rich_mod2_poiss | glmerMod |  250.3 (0.006) | 261.4 (0.001) | 2.818
## netting_rich_mod3_poiss | glmerMod |  247.2 (0.028) | 257.6 (0.008) | 2.820
## netting_rich_mod4_poiss | glmerMod |  244.2 (0.121) | 253.8 (0.055) | 2.822
## netting_rich_mod5_poiss | glmerMod |  242.3 (0.311) | 250.9 (0.233) | 2.874
## netting_rich_mod6_poiss | glmerMod |  241.2 (0.533) | 248.7 (0.703) | 2.930
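
The AICc values and weights in the table can be recovered from the AIC values in the model summaries. A minimal sketch, assuming k counts the fixed effects plus the single random-intercept variance and n = 45 observations; the object names are illustrative, and the weights differ slightly from the table because the AIC inputs are rounded to one decimal.

```r
# AIC values copied from the six model summaries
aic <- c(mod1 = 247.1, mod2 = 245.2, mod3 = 243.2,
         mod4 = 241.2, mod5 = 240.1, mod6 = 239.7)
k <- c(10, 9, 8, 7, 6, 5)  # fixed effects + 1 random-intercept variance
n <- 45                    # number of observations

# Small-sample correction
aicc <- aic + (2 * k * (k + 1)) / (n - k - 1)

# Akaike weights: relative support for each model within this candidate set
delta <- aicc - min(aicc)
weights <- exp(-delta / 2) / sum(exp(-delta / 2))

round(data.frame(AICc = aicc, delta = delta, weight = weights), 3)
```

The weights only rank the six candidates against each other; they say nothing about whether the best of them fits well in absolute terms, which is what the residual diagnostics above address.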

IV.B.3.b. Visualize the model results

#plot_model(netting_rich_mod1_poiss, type = "est", show.values = TRUE, value.offset = 0.3)
#plot_model(netting_rich_mod2_poiss, type = "est", show.values = TRUE, value.offset = 0.3)
#plot_model(netting_rich_mod3_poiss, type = "est", show.values = TRUE, value.offset = 0.3)
#plot_model(netting_rich_mod4_poiss, type = "est", show.values = TRUE, value.offset = 0.3)
#plot_model(netting_rich_mod5_poiss, type = "est", show.values = TRUE, value.offset = 0.3)
plot_model(netting_rich_mod6_poiss, type = "est", show.values = TRUE, value.offset = 0.3) 

(mod6 <-plot_model(netting_rich_mod6_poiss, 
           type = "est", 
           show.values = TRUE, 
           value.offset = 0.3,
           #sort.est = TRUE,
           axis.labels = c("Days Since Start",
                           "Time of Day",
                           "Floral Simpson Index")) +
    labs(title = "Transect walk: Species Richness",y = "Estimate") + 
    theme(axis.text.y = element_text(hjust = 1)))  # 0 = left, 1 = right

## days since start netting_rich_mod1_poiss ---------
# Get the original mean and SD of days since start before scaling
days_mean <- mean(envir_data$Days_since_start, na.rm = TRUE)
days_sd <- sd(envir_data$Days_since_start, na.rm = TRUE)
# Get predictions on the scaled variable
pred_days <- ggpredict(netting_rich_mod1_poiss , terms = "Days_since_start")
# Unscale the x-axis
pred_days$x_unscaled <- (pred_days$x * days_sd) + days_mean
# Plot
(days_rich_net <- ggplot(pred_days, aes(x = x_unscaled, y = predicted)) +
  geom_line(size = 1.2, color = predictor_colors[["Days_since_start"]]) +
  geom_ribbon(aes(ymin = conf.low, ymax = conf.high), 
              fill = alpha(predictor_colors[["Days_since_start"]], 0.5)) +
  labs(
    title = "Transect walk: Predicted Insect Richness vs Days Since Start",
    x = "Days Since Start",
    y = "Predicted Count of unique taxa"
  ))

## minutes since 9am index netting_rich_mod1_poiss ---------
# Get the original mean and SD of minutes since 9am before scaling
minutes_mean <- mean(envir_data$minutes_since_9am, na.rm = TRUE)
minutes_sd <- sd(envir_data$minutes_since_9am, na.rm = TRUE)
# Get predictions on the scaled variable
pred_minutes <- ggpredict(netting_rich_mod1_poiss , terms = "minutes_since_9am")
# Unscale the x-axis
pred_minutes$x_unscaled <- (pred_minutes$x * minutes_sd) + minutes_mean
# divide by 60 to get hours, and add 9 to get hour of the day
pred_minutes$x_unscaled <- (pred_minutes$x_unscaled / 60) + 9

# Plot
ggplot(pred_minutes, aes(x = x_unscaled, y = predicted)) +
  geom_line(size = 1.2, color = predictor_colors[["minutes_since_9am"]]) +
  geom_ribbon(aes(ymin = conf.low, ymax = conf.high), 
              fill = alpha(predictor_colors[["minutes_since_9am"]], 0.5)) +
  labs(
    title = "Transect walk: Predicted Insect Richness vs Time of day",
    x = "Time of day",
    y = "Predicted Count of unique taxa caught during transect walk"
  ) 
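
The back-transformation used for both x-axes above relies on the fact that `scale()` subtracts the mean and divides by the standard deviation. A minimal sketch with hypothetical values (the vectors `days_raw` and `minutes_since_9am` below are made up for illustration):

```r
# Hypothetical raw predictor (days since the start of the field season)
days_raw <- c(0, 3, 7, 12, 20, 24, 31)

# scale() centres on the mean and divides by the standard deviation
days_scaled <- as.numeric(scale(days_raw))

# Back-transform: multiply by the original SD and add the original mean
days_back <- days_scaled * sd(days_raw) + mean(days_raw)
all.equal(days_back, days_raw)

# Minutes since 09:00 converted to clock hours, as done for the time-of-day axis
minutes_since_9am <- c(0, 90, 240)
hour_of_day <- minutes_since_9am / 60 + 9
hour_of_day
```

The round trip is only exact when the mean and SD come from the same rows that were scaled, so it is worth checking that `envir_data` and the scaled data cover identical observations.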

#save combined plots mod6 and days_rich_net
library(cowplot)
#change title of mod6
mod6 <- mod6 + labs(title = "Species Richness")+
  theme(axis.title = element_text(size = 12))
days_rich_net <- days_rich_net + labs(title = "Species Richness vs Days Since Start")+
  #change font size
  theme(axis.title = element_text(size = 12))


# Step 1: Combine the plots and label ONLY them (A and B)
combined_plots <- cowplot::plot_grid(
  mod6, 
  days_rich_net, 
  ncol = 2,
  labels = c("A", "B"),   # Label just these two plots
  label_size = 12, 
  rel_widths = c(1.2, 1) 
)

# Step 2: Add the title separately, so it doesn't get a label
(final_plot <- cowplot::plot_grid(
  ggdraw() + draw_label(
    "Transect walk: Species Richness", 
    fontface = 'bold', size = 12, x = 0.5, hjust = 0.5
  ),
  combined_plots,
  ncol = 1,
  rel_heights = c(0.1, 1)  # Title height vs. plots height
))

# Save the combined plot
ggsave("C:/Users/Almas/Desktop/UNI_LEIPSI/Thesis/Thesis_Rproject/figures/days_rich_net.png", plot = final_plot, 
       #size to match an a4 page      
       width = 20, height = 10, units = "cm", 
       dpi = 300, device = "png")

IV.B.3.c. Interpretation of results

#removal of unnecessary objects - all starting with netting_
rm(list = ls(pattern = "^netting_"))

IV.B.4. NETTING Shannon Index - Gaussian

# 1. First, aggregate by 'site' and 'transect' and count the records per taxon (lowest_taxa)
netting_agg <- netting %>%
  group_by(site, transect, lowest_taxa) %>%
  summarise(count = n(), .groups = 'drop')  # Count the occurrences of each taxon

# 2. Pivot the data into a wider format, where each column is a taxon and the values are counts
netting_wide <- netting_agg %>%
  pivot_wider(names_from = lowest_taxa, values_from = count, values_fill = list(count = 0))

# 3. Compute Shannon diversity index for each transect per site using the 'vegan::diversity()' function
# Apply the diversity function to each row, excluding the site and transect identifiers
netting_div <- netting_wide %>%
  rowwise() %>%
  mutate(shannon_diversity = diversity(c_across(3:106), index = "shannon"),
    simpson_diversity = diversity(c_across(3:106), index = "simpson")
    ) %>%
  ungroup()%>%
  #remove all the columns with the counts
  dplyr::select(-c(3:106)) 
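
For reference, the two indices requested from `vegan::diversity()` above can be written out in a few lines of base R. This is a sketch of the standard definitions (Shannon with natural logarithm, Simpson as 1 minus the sum of squared proportions); the helper names and the example counts are hypothetical.

```r
# Shannon index: H = -sum(p * ln(p)), ignoring taxa with zero counts
shannon_index <- function(counts) {
  p <- counts[counts > 0] / sum(counts)
  -sum(p * log(p))
}

# Simpson index in the "1 - D" form: 1 - sum(p^2)
simpson_index <- function(counts) {
  p <- counts / sum(counts)
  1 - sum(p^2)
}

# Hypothetical counts for one transect (one value per taxon)
counts <- c(taxon_a = 5, taxon_b = 3, taxon_c = 2, taxon_d = 0)
shannon_index(counts)
simpson_index(counts)

# Sanity check: four equally abundant taxa give H = ln(4) and 1 - D = 0.75
shannon_index(c(2, 2, 2, 2))
simpson_index(c(2, 2, 2, 2))
```

Both indices depend on the taxonomic resolution of `lowest_taxa`, so values are only comparable between transects identified to the same level.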

netting_diversity <- netting_div %>%
  #join the scaled_envir_data 
  left_join(scaled_envir_data, by = c("site"="Site", "transect"="Transect"))

netting_div_unscaled <- netting_div %>%
  #join the unscaled envir_data 
  left_join(envir_data, by = c("site"="Site", "transect"="Transect"))

# View the new data frame with Shannon diversity values
head(netting_diversity)
## # A tibble: 6 × 23
##   site  transect shannon_diversity simpson_diversity Date      
##   <chr> <chr>                <dbl>             <dbl> <date>    
## 1 BUH   T1                    2.03             0.86  2024-08-13
## 2 BUH   T2                    2.14             0.864 2024-08-13
## 3 BUH   T3                    1.89             0.765 2024-08-13
## 4 BUH   T4                    1.58             0.741 2024-08-13
## 5 BUH   T5                    1.56             0.778 2024-08-13
## 6 DES   T1                    2.59             0.894 2024-07-21
## # ℹ 18 more variables: minutes_since_9am <dbl[,1]>, dm_wind_velocity <dbl[,1]>,
## #   dm_temperature <dbl[,1]>, agri <dbl[,1]>, grass <dbl[,1]>, snh <dbl[,1]>,
## #   forest <dbl[,1]>, urban <dbl[,1]>, water <dbl[,1]>, majority_class <chr>,
## #   Pastinaca.sativa <dbl[,1]>, Daucus.carota <dbl[,1]>, top2_ratio <dbl[,1]>,
## #   average_flower_cover <dbl[,1]>, Plot_Cover_T <dbl[,1]>, Site_type <chr>,
## #   Floral_simpson_index_T <dbl[,1]>, Days_since_start <dbl[,1]>
#remove intermediate data frames
rm(netting_agg, netting_wide, netting_div)

#histogram of shannon diversity, binwidth = 0.05
netting_diversity %>%
  ggplot(aes(x = shannon_diversity)) +
  geom_histogram(binwidth = 0.05, fill = "lightblue", color = "black") +
  labs(title = "Histogram of Shannon Diversity Index",
       x = "Shannon Diversity Index",
       y = "Count")

#testing the normality of the shannon index
shapiro.test(netting_diversity$shannon_diversity) # p-value = 0.1578, no significant deviation from normality
## 
##  Shapiro-Wilk normality test
## 
## data:  netting_diversity$shannon_diversity
## W = 0.96292, p-value = 0.1578
#testing skewness
datawizard::describe_distribution(netting_diversity$shannon_diversity)
## Mean |   SD |  IQR |        Range | Skewness | Kurtosis |  n | n_Missing
## ------------------------------------------------------------------------
## 2.30 | 0.31 | 0.44 | [1.56, 2.89] |    -0.55 |    -0.23 | 45 |         0

The Shannon index does not deviate significantly from a normal distribution (Shapiro-Wilk test: W = 0.963, p = 0.1578). It has a skewness of -0.55, indicating a moderate left skew, and a kurtosis of -0.23, indicating a slightly platykurtic distribution (flatter than normal) that is still close to normal. Since the Shannon index is a continuous variable, a Poisson model, which expects integer counts, is not appropriate. Instead we assume a Gaussian (normal) error distribution and fit a linear mixed model (LMM).
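
The shape statistics quoted above can be approximated with simple moment formulas. The sketch below uses hypothetical index values and the plain moment estimators; `describe_distribution()` may apply a small-sample correction, so its numbers can differ slightly from these.

```r
# Hypothetical Shannon index values, for illustration only
x <- c(1.56, 1.89, 2.03, 2.14, 2.30, 2.41, 2.59, 2.89)

# Moment-based skewness: third central moment / variance^(3/2)
moment_skewness <- function(x) {
  m <- mean(x)
  mean((x - m)^3) / mean((x - m)^2)^1.5
}

# Excess kurtosis: fourth central moment / variance^2, minus 3 (0 for a normal)
excess_kurtosis <- function(x) {
  m <- mean(x)
  mean((x - m)^4) / mean((x - m)^2)^2 - 3
}

moment_skewness(x)   # negative values indicate a left skew
excess_kurtosis(x)   # negative values indicate a flatter-than-normal shape

# Shapiro-Wilk test: a large p-value means no evidence against normality
shapiro.test(x)$p.value
```

With only 45 observations the Shapiro-Wilk test has limited power, so the residual diagnostics of the fitted model remain the more informative check.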

# full model with shannon index as response variable and environmental, weather and plant diversity variables as explanatory variables, and site as random effect
# Gaussian distribution (linear mixed model)
netting_shannon_mod1_gauss <- lmer(shannon_diversity 
                                   ~Floral_simpson_index_T 
                                   + minutes_since_9am
                                   + top2_ratio
                                   + Site_type
                                   + dm_wind_velocity
                                   + dm_temperature
                                   + Days_since_start
                                   + Plot_Cover_T
                                   + (1 | site), 
                                   data = netting_diversity)
summary(netting_shannon_mod1_gauss)
## Linear mixed model fit by REML ['lmerMod']
## Formula: shannon_diversity ~ Floral_simpson_index_T + minutes_since_9am +  
##     top2_ratio + Site_type + dm_wind_velocity + dm_temperature +  
##     Days_since_start + Plot_Cover_T + (1 | site)
##    Data: netting_diversity
## 
## REML criterion at convergence: 31
## 
## Scaled residuals: 
##     Min      1Q  Median      3Q     Max 
## -2.8172 -0.3679  0.1246  0.3590  1.6068 
## 
## Random effects:
##  Groups   Name        Variance Std.Dev.
##  site     (Intercept) 0.01704  0.1305  
##  Residual             0.05497  0.2345  
## Number of obs: 45, groups:  site, 9
## 
## Fixed effects:
##                          Estimate Std. Error t value
## (Intercept)              2.283066   0.087287  26.156
## Floral_simpson_index_T  -0.007236   0.052801  -0.137
## minutes_since_9am       -0.083953   0.044132  -1.902
## top2_ratio               0.002639   0.043303   0.061
## Site_typeyoung_restored  0.027872   0.151007   0.185
## dm_wind_velocity         0.103656   0.085229   1.216
## dm_temperature           0.051423   0.089900   0.572
## Days_since_start        -0.255800   0.069729  -3.668
## Plot_Cover_T            -0.007696   0.059455  -0.129
## 
## Correlation of Fixed Effects:
##             (Intr) Fl___T mnt__9 tp2_rt St_ty_ dm_wn_ dm_tmp Dys_s_
## Flrl_smp__T -0.105                                                 
## mnts_snc_9m  0.165  0.153                                          
## top2_ratio  -0.033  0.077 -0.070                                   
## St_typyng_r -0.769  0.137 -0.215  0.043                            
## dm_wnd_vlct  0.268  0.055  0.148 -0.030 -0.348                     
## dm_tempertr  0.462 -0.005  0.206  0.056 -0.601  0.639              
## Dys_snc_str  0.121  0.041  0.206  0.058 -0.157 -0.423 -0.094       
## Plot_Covr_T -0.259  0.592  0.008 -0.184  0.336 -0.194 -0.355  0.022
parameters(netting_shannon_mod1_gauss)
## # Fixed Effects
## 
## Parameter                  | Coefficient |   SE |         95% CI | t(34) |      p
## ---------------------------------------------------------------------------------
## (Intercept)                |        2.28 | 0.09 | [ 2.11,  2.46] | 26.16 | < .001
## Floral simpson index T     |   -7.24e-03 | 0.05 | [-0.11,  0.10] | -0.14 | 0.892 
## minutes since 9am          |       -0.08 | 0.04 | [-0.17,  0.01] | -1.90 | 0.066 
## top2 ratio                 |    2.64e-03 | 0.04 | [-0.09,  0.09] |  0.06 | 0.952 
## Site type [young_restored] |        0.03 | 0.15 | [-0.28,  0.33] |  0.18 | 0.855 
## dm wind velocity           |        0.10 | 0.09 | [-0.07,  0.28] |  1.22 | 0.232 
## dm temperature             |        0.05 | 0.09 | [-0.13,  0.23] |  0.57 | 0.571 
## Days since start           |       -0.26 | 0.07 | [-0.40, -0.11] | -3.67 | < .001
## Plot Cover T               |   -7.70e-03 | 0.06 | [-0.13,  0.11] | -0.13 | 0.898 
## 
## # Random Effects
## 
## Parameter            | Coefficient |   SE |       95% CI
## --------------------------------------------------------
## SD (Intercept: site) |        0.13 | 0.08 | [0.04, 0.46]
## SD (Residual)        |        0.23 | 0.03 | [0.18, 0.30]
## 
## Uncertainty intervals (equal-tailed) and p-values (two-tailed) computed
##   using a Wald t-distribution approximation. Uncertainty intervals for
##   random effect variances computed using a Wald z-distribution
##   approximation.
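
The random-effect estimates above also show how much of the unexplained variation sits between sites. A small sketch, assuming the usual (adjusted) intraclass correlation for a random-intercept model and using the variance components copied from the `summary()` output:

```r
# Variance components copied from the summary of netting_shannon_mod1_gauss
var_site  <- 0.01704   # between-site (random intercept) variance
var_resid <- 0.05497   # residual (within-site) variance

# The standard deviations reported by parameters() are the square roots
sd_site  <- sqrt(var_site)
sd_resid <- sqrt(var_resid)

# Adjusted ICC: share of the random variation attributable to site
icc <- var_site / (var_site + var_resid)

round(c(sd_site = sd_site, sd_resid = sd_resid, icc = icc), 3)
```

An ICC of roughly a quarter suggests that transects within a site are noticeably more alike than transects from different sites, which supports keeping `(1 | site)` in the model.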
#check for singularity
performance::check_singularity(netting_shannon_mod1_gauss)
## [1] FALSE
#check the model
check_model(netting_shannon_mod1_gauss, verbose = T)
## Some of the variables were in matrix-format - probably you used
##   `scale()` on your data?
##   If so, and you get an error, please try `datawizard::standardize()` to
##   standardize your data.
## Some of the variables were in matrix-format - probably you used
##   `scale()` on your data?
##   If so, and you get an error, please try `datawizard::standardize()` to
##   standardize your data.

#overdispersion
check_overdispersion(netting_shannon_mod1_gauss)
## # Overdispersion test
## 
##  dispersion ratio = 0.738
##           p-value =  0.24
## No overdispersion detected.
#collinearity
check_collinearity(netting_shannon_mod1_gauss)
## # Check for Multicollinearity
## 
## Low Correlation
## 
##                    Term  VIF   VIF 95% CI Increased SE Tolerance
##  Floral_simpson_index_T 1.85 [1.42, 2.71]         1.36      0.54
##       minutes_since_9am 1.17 [1.03, 1.96]         1.08      0.85
##              top2_ratio 1.14 [1.02, 2.04]         1.07      0.88
##               Site_type 1.81 [1.40, 2.65]         1.34      0.55
##        dm_wind_velocity 2.28 [1.69, 3.36]         1.51      0.44
##          dm_temperature 2.54 [1.86, 3.75]         1.59      0.39
##        Days_since_start 1.53 [1.22, 2.24]         1.24      0.66
##            Plot_Cover_T 2.11 [1.58, 3.10]         1.45      0.47
##  Tolerance 95% CI
##      [0.37, 0.70]
##      [0.51, 0.97]
##      [0.49, 0.98]
##      [0.38, 0.72]
##      [0.30, 0.59]
##      [0.27, 0.54]
##      [0.45, 0.82]
##      [0.32, 0.63]
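
The columns of the collinearity table are linked by simple relations: tolerance is 1 / VIF and the "Increased SE" column is sqrt(VIF). The sketch below illustrates the textbook regression-based VIF on simulated predictors; the data and the helper `vif_by_regression()` are hypothetical, and `check_collinearity()` may compute VIFs differently for mixed models.

```r
set.seed(42)
n <- 45

# Simulated, partly correlated predictors (hypothetical data)
temperature <- rnorm(n)
wind        <- 0.6 * temperature + rnorm(n, sd = 0.8)
days        <- rnorm(n)
predictors  <- data.frame(temperature, wind, days)

# VIF_j = 1 / (1 - R^2_j), where R^2_j comes from regressing
# predictor j on all the other predictors
vif_by_regression <- function(data) {
  sapply(names(data), function(v) {
    f  <- reformulate(setdiff(names(data), v), response = v)
    r2 <- summary(lm(f, data = data))$r.squared
    1 / (1 - r2)
  })
}

vif          <- vif_by_regression(predictors)
tolerance    <- 1 / vif
increased_se <- sqrt(vif)

round(rbind(vif, tolerance, increased_se), 2)
```

Under a common rule of thumb, VIF values below about 5 are treated as unproblematic, which is consistent with the "Low Correlation" label in the output above.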
# DHARMa package - simulate residuals and check model assumptions
netting_shannon_mod1_gauss_sim_res <- simulateResiduals(fittedModel = netting_shannon_mod1_gauss)
plot(netting_shannon_mod1_gauss_sim_res)

# reduced model with shannon index as response variable and environmental, weather and plant diversity variables as explanatory variables, and site as random effect
# removing top2_ratio (p = 0.952 for netting_shannon_mod1_gauss)

netting_shannon_mod2_gauss <- lmer(shannon_diversity 
                                   ~Floral_simpson_index_T 
                                   + minutes_since_9am
                                   #+ top2_ratio
                                   + Site_type
                                   + dm_wind_velocity
                                   + dm_temperature
                                   + Days_since_start
                                   + Plot_Cover_T
                                   + (1 | site), 
                                   data = netting_diversity)
summary(netting_shannon_mod2_gauss)
## Linear mixed model fit by REML ['lmerMod']
## Formula: shannon_diversity ~ Floral_simpson_index_T + minutes_since_9am +  
##     Site_type + dm_wind_velocity + dm_temperature + Days_since_start +  
##     Plot_Cover_T + (1 | site)
##    Data: netting_diversity
## 
## REML criterion at convergence: 26.6
## 
## Scaled residuals: 
##     Min      1Q  Median      3Q     Max 
## -2.8604 -0.3695  0.1249  0.3795  1.6370 
## 
## Random effects:
##  Groups   Name        Variance Std.Dev.
##  site     (Intercept) 0.01628  0.1276  
##  Residual             0.05356  0.2314  
## Number of obs: 45, groups:  site, 9
## 
## Fixed effects:
##                          Estimate Std. Error t value
## (Intercept)              2.283171   0.085652  26.656
## Floral_simpson_index_T  -0.007682   0.051951  -0.148
## minutes_since_9am       -0.083894   0.043419  -1.932
## Site_typeyoung_restored  0.027634   0.148178   0.186
## dm_wind_velocity         0.103687   0.083641   1.240
## dm_temperature           0.050954   0.088181   0.578
## Days_since_start        -0.256080   0.068325  -3.748
## Plot_Cover_T            -0.007007   0.057682  -0.121
## 
## Correlation of Fixed Effects:
##             (Intr) Fl___T mnt__9 St_ty_ dm_wn_ dm_tmp Dys_s_
## Flrl_smp__T -0.104                                          
## mnts_snc_9m  0.164  0.159                                   
## St_typyng_r -0.769  0.135 -0.213                            
## dm_wnd_vlct  0.268  0.058  0.147 -0.348                     
## dm_tempertr  0.466 -0.010  0.211 -0.606  0.643              
## Dys_snc_str  0.123  0.037  0.212 -0.160 -0.422 -0.098       
## Plot_Covr_T -0.271  0.619 -0.005  0.352 -0.204 -0.353  0.033
parameters(netting_shannon_mod2_gauss)
## # Fixed Effects
## 
## Parameter                  | Coefficient |   SE |         95% CI | t(35) |      p
## ---------------------------------------------------------------------------------
## (Intercept)                |        2.28 | 0.09 | [ 2.11,  2.46] | 26.66 | < .001
## Floral simpson index T     |   -7.68e-03 | 0.05 | [-0.11,  0.10] | -0.15 | 0.883 
## minutes since 9am          |       -0.08 | 0.04 | [-0.17,  0.00] | -1.93 | 0.061 
## Site type [young_restored] |        0.03 | 0.15 | [-0.27,  0.33] |  0.19 | 0.853 
## dm wind velocity           |        0.10 | 0.08 | [-0.07,  0.27] |  1.24 | 0.223 
## dm temperature             |        0.05 | 0.09 | [-0.13,  0.23] |  0.58 | 0.567 
## Days since start           |       -0.26 | 0.07 | [-0.39, -0.12] | -3.75 | < .001
## Plot Cover T               |   -7.01e-03 | 0.06 | [-0.12,  0.11] | -0.12 | 0.904 
## 
## # Random Effects
## 
## Parameter            | Coefficient |   SE |       95% CI
## --------------------------------------------------------
## SD (Intercept: site) |        0.13 | 0.08 | [0.04, 0.43]
## SD (Residual)        |        0.23 | 0.03 | [0.18, 0.29]
## 
## Uncertainty intervals (equal-tailed) and p-values (two-tailed) computed
##   using a Wald t-distribution approximation. Uncertainty intervals for
##   random effect variances computed using a Wald z-distribution
##   approximation.
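
The note about the Wald t-distribution approximation can be made concrete by recomputing one row of the table. This sketch takes the estimate, standard error and degrees of freedom for Days_since_start from the output of netting_shannon_mod2_gauss above and assumes a plain two-sided t-test:

```r
# Values copied from the model output above (Days_since_start, mod2)
estimate  <- -0.256080
std_error <-  0.068325
df_resid  <-  35          # degrees of freedom shown as t(35)

# Wald t statistic
t_value <- estimate / std_error

# 95% confidence interval: estimate +/- critical t * standard error
t_crit <- qt(0.975, df = df_resid)
ci     <- estimate + c(-1, 1) * t_crit * std_error

# Two-tailed p-value
p_value <- 2 * pt(-abs(t_value), df = df_resid)

round(c(t = t_value, lower = ci[1], upper = ci[2]), 2)
p_value < 0.001
```

Because the residual degrees of freedom of a mixed model are themselves an approximation, these p-values are best read as approximate, especially for predictors close to the 0.05 threshold such as minutes_since_9am.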
#check the model
check_model(netting_shannon_mod2_gauss, verbose = T)
## Some of the variables were in matrix-format - probably you used
##   `scale()` on your data?
##   If so, and you get an error, please try `datawizard::standardize()` to
##   standardize your data.
## Some of the variables were in matrix-format - probably you used
##   `scale()` on your data?
##   If so, and you get an error, please try `datawizard::standardize()` to
##   standardize your data.

#overdispersion
check_overdispersion(netting_shannon_mod2_gauss)
## # Overdispersion test
## 
##  dispersion ratio = 0.762
##           p-value = 0.264
## No overdispersion detected.
#collinearity
check_collinearity(netting_shannon_mod2_gauss)
## # Check for Multicollinearity
## 
## Low Correlation
## 
##                    Term  VIF   VIF 95% CI Increased SE Tolerance
##  Floral_simpson_index_T 1.84 [1.41, 2.73]         1.36      0.54
##       minutes_since_9am 1.17 [1.03, 2.02]         1.08      0.86
##               Site_type 1.81 [1.39, 2.68]         1.34      0.55
##        dm_wind_velocity 2.28 [1.68, 3.40]         1.51      0.44
##          dm_temperature 2.54 [1.84, 3.80]         1.59      0.39
##        Days_since_start 1.52 [1.22, 2.27]         1.23      0.66
##            Plot_Cover_T 2.04 [1.53, 3.04]         1.43      0.49
##  Tolerance 95% CI
##      [0.37, 0.71]
##      [0.49, 0.97]
##      [0.37, 0.72]
##      [0.29, 0.59]
##      [0.26, 0.54]
##      [0.44, 0.82]
##      [0.33, 0.65]
# DHARMa package - simulate residuals and check model assumptions
netting_shannon_mod2_gauss_sim_res <- simulateResiduals(fittedModel = netting_shannon_mod2_gauss)
plot(netting_shannon_mod2_gauss_sim_res)

#remove Plot cover T (p= 0.904    for netting_shannon_mod2_gauss)
netting_shannon_mod3_gauss <- lmer(shannon_diversity 
                                   ~Floral_simpson_index_T 
                                   + minutes_since_9am
                                   #+ top2_ratio
                                   + Site_type
                                   + dm_wind_velocity
                                   + dm_temperature
                                   + Days_since_start
                                   #+ Plot_Cover_T
                                   + (1 | site), 
                                   data = netting_diversity)

summary(netting_shannon_mod3_gauss)
## Linear mixed model fit by REML ['lmerMod']
## Formula: shannon_diversity ~ Floral_simpson_index_T + minutes_since_9am +  
##     Site_type + dm_wind_velocity + dm_temperature + Days_since_start +  
##     (1 | site)
##    Data: netting_diversity
## 
## REML criterion at convergence: 22.7
## 
## Scaled residuals: 
##     Min      1Q  Median      3Q     Max 
## -2.8941 -0.3726  0.1167  0.3777  1.6538 
## 
## Random effects:
##  Groups   Name        Variance Std.Dev.
##  site     (Intercept) 0.01650  0.1284  
##  Residual             0.05203  0.2281  
## Number of obs: 45, groups:  site, 9
## 
## Fixed effects:
##                          Estimate Std. Error t value
## (Intercept)              2.280484   0.082285  27.714
## Floral_simpson_index_T  -0.003322   0.040252  -0.083
## minutes_since_9am       -0.083641   0.042867  -1.951
## Site_typeyoung_restored  0.033680   0.138364   0.243
## dm_wind_velocity         0.101864   0.081676   1.247
## dm_temperature           0.047497   0.082282   0.577
## Days_since_start        -0.255729   0.068141  -3.753
## 
## Correlation of Fixed Effects:
##             (Intr) Fl___T mnt__9 St_ty_ dm_wn_ dm_tmp
## Flrl_smp__T  0.083                                   
## mnts_snc_9m  0.167  0.207                            
## St_typyng_r -0.747 -0.112 -0.224                     
## dm_wnd_vlct  0.225  0.237  0.147 -0.301              
## dm_tempertr  0.410  0.280  0.222 -0.549  0.622       
## Dys_snc_str  0.136  0.020  0.210 -0.183 -0.426 -0.093
parameters(netting_shannon_mod3_gauss)
## # Fixed Effects
## 
## Parameter                  | Coefficient |   SE |         95% CI | t(36) |      p
## ---------------------------------------------------------------------------------
## (Intercept)                |        2.28 | 0.08 | [ 2.11,  2.45] | 27.71 | < .001
## Floral simpson index T     |   -3.32e-03 | 0.04 | [-0.08,  0.08] | -0.08 | 0.935 
## minutes since 9am          |       -0.08 | 0.04 | [-0.17,  0.00] | -1.95 | 0.059 
## Site type [young_restored] |        0.03 | 0.14 | [-0.25,  0.31] |  0.24 | 0.809 
## dm wind velocity           |        0.10 | 0.08 | [-0.06,  0.27] |  1.25 | 0.220 
## dm temperature             |        0.05 | 0.08 | [-0.12,  0.21] |  0.58 | 0.567 
## Days since start           |       -0.26 | 0.07 | [-0.39, -0.12] | -3.75 | < .001
## 
## # Random Effects
## 
## Parameter            | Coefficient |   SE |       95% CI
## --------------------------------------------------------
## SD (Intercept: site) |        0.13 | 0.08 | [0.04, 0.42]
## SD (Residual)        |        0.23 | 0.03 | [0.18, 0.29]
## 
## Uncertainty intervals (equal-tailed) and p-values (two-tailed) computed
##   using a Wald t-distribution approximation. Uncertainty intervals for
##   random effect variances computed using a Wald z-distribution
##   approximation.
#check the model
check_model(netting_shannon_mod3_gauss, verbose = T)
## Some of the variables were in matrix-format - probably you used
##   `scale()` on your data?
##   If so, and you get an error, please try `datawizard::standardize()` to
##   standardize your data.
## Some of the variables were in matrix-format - probably you used
##   `scale()` on your data?
##   If so, and you get an error, please try `datawizard::standardize()` to
##   standardize your data.

#overdispersion
check_overdispersion(netting_shannon_mod3_gauss)
## # Overdispersion test
## 
##  dispersion ratio = 0.778
##           p-value = 0.328
## No overdispersion detected.
#collinearity
check_collinearity(netting_shannon_mod3_gauss)
## # Check for Multicollinearity
## 
## Low Correlation
## 
##                    Term  VIF   VIF 95% CI Increased SE Tolerance
##  Floral_simpson_index_T 1.13 [1.01, 2.22]         1.06      0.88
##       minutes_since_9am 1.16 [1.02, 2.08]         1.08      0.86
##               Site_type 1.58 [1.24, 2.38]         1.26      0.63
##        dm_wind_velocity 2.18 [1.61, 3.29]         1.48      0.46
##          dm_temperature 2.21 [1.63, 3.34]         1.49      0.45
##        Days_since_start 1.52 [1.21, 2.29]         1.23      0.66
##  Tolerance 95% CI
##      [0.45, 0.99]
##      [0.48, 0.98]
##      [0.42, 0.80]
##      [0.30, 0.62]
##      [0.30, 0.61]
##      [0.44, 0.83]
# DHARMa package - simulate residuals and check model assumptions
netting_shannon_mod3_gauss_sim_res <- simulateResiduals(fittedModel = netting_shannon_mod3_gauss)
plot(netting_shannon_mod3_gauss_sim_res)

#remove floral simpson index (p= 0.935     for netting_shannon_mod3_gauss)
netting_shannon_mod4_gauss <- lmer(shannon_diversity 
                                   #~Floral_simpson_index_T 
                                   ~ minutes_since_9am
                                   #+ top2_ratio
                                   + Site_type
                                   + dm_wind_velocity
                                   + dm_temperature
                                   + Days_since_start
                                   #+ Plot_Cover_T
                                   + (1 | site), 
                                   data = netting_diversity)
summary(netting_shannon_mod4_gauss)
## Linear mixed model fit by REML ['lmerMod']
## Formula: 
## shannon_diversity ~ minutes_since_9am + Site_type + dm_wind_velocity +  
##     dm_temperature + Days_since_start + (1 | site)
##    Data: netting_diversity
## 
## REML criterion at convergence: 18.1
## 
## Scaled residuals: 
##     Min      1Q  Median      3Q     Max 
## -2.9326 -0.3718  0.1293  0.3879  1.6786 
## 
## Random effects:
##  Groups   Name        Variance Std.Dev.
##  site     (Intercept) 0.01683  0.1297  
##  Residual             0.05055  0.2248  
## Number of obs: 45, groups:  site, 9
## 
## Fixed effects:
##                         Estimate Std. Error t value
## (Intercept)              2.28111    0.08203  27.808
## minutes_since_9am       -0.08270    0.04142  -1.997
## Site_typeyoung_restored  0.03226    0.13752   0.235
## dm_wind_velocity         0.10351    0.07938   1.304
## dm_temperature           0.04947    0.07901   0.626
## Days_since_start        -0.25554    0.06814  -3.751
## 
## Correlation of Fixed Effects:
##             (Intr) mnt__9 St_ty_ dm_wn_ dm_tmp
## mnts_snc_9m  0.152                            
## St_typyng_r -0.745 -0.204                     
## dm_wnd_vlct  0.211  0.102 -0.284              
## dm_tempertr  0.404  0.172 -0.543  0.596       
## Dys_snc_str  0.134  0.207 -0.180 -0.444 -0.104
parameters(netting_shannon_mod4_gauss)
## # Fixed Effects
## 
## Parameter                  | Coefficient |   SE |         95% CI | t(37) |      p
## ---------------------------------------------------------------------------------
## (Intercept)                |        2.28 | 0.08 | [ 2.11,  2.45] | 27.81 | < .001
## minutes since 9am          |       -0.08 | 0.04 | [-0.17,  0.00] | -2.00 | 0.053 
## Site type [young_restored] |        0.03 | 0.14 | [-0.25,  0.31] |  0.23 | 0.816 
## dm wind velocity           |        0.10 | 0.08 | [-0.06,  0.26] |  1.30 | 0.200 
## dm temperature             |        0.05 | 0.08 | [-0.11,  0.21] |  0.63 | 0.535 
## Days since start           |       -0.26 | 0.07 | [-0.39, -0.12] | -3.75 | < .001
## 
## # Random Effects
## 
## Parameter            | Coefficient |   SE |       95% CI
## --------------------------------------------------------
## SD (Intercept: site) |        0.13 | 0.08 | [0.04, 0.41]
## SD (Residual)        |        0.22 | 0.03 | [0.18, 0.28]
## 
## Uncertainty intervals (equal-tailed) and p-values (two-tailed) computed
##   using a Wald t-distribution approximation. Uncertainty intervals for
##   random effect variances computed using a Wald z-distribution
##   approximation.
#check the model
check_model(netting_shannon_mod4_gauss, verbose = T)
## Some of the variables were in matrix-format - probably you used
##   `scale()` on your data?
##   If so, and you get an error, please try `datawizard::standardize()` to
##   standardize your data.
## Some of the variables were in matrix-format - probably you used
##   `scale()` on your data?
##   If so, and you get an error, please try `datawizard::standardize()` to
##   standardize your data.

#overdispersion
check_overdispersion(netting_shannon_mod4_gauss)
## # Overdispersion test
## 
##  dispersion ratio = 0.794
##           p-value = 0.416
## No overdispersion detected.
#collinearity
check_collinearity(netting_shannon_mod4_gauss)
## # Check for Multicollinearity
## 
## Low Correlation
## 
##               Term  VIF   VIF 95% CI Increased SE Tolerance Tolerance 95% CI
##  minutes_since_9am 1.11 [1.01, 2.55]         1.05      0.90     [0.39, 0.99]
##          Site_type 1.56 [1.23, 2.38]         1.25      0.64     [0.42, 0.82]
##   dm_wind_velocity 2.06 [1.52, 3.14]         1.43      0.49     [0.32, 0.66]
##     dm_temperature 2.04 [1.51, 3.11]         1.43      0.49     [0.32, 0.66]
##   Days_since_start 1.52 [1.20, 2.32]         1.23      0.66     [0.43, 0.83]
# DHARMa package - simulate residuals and check model assumptions
netting_shannon_mod4_gauss_sim_res <- simulateResiduals(fittedModel = netting_shannon_mod4_gauss)
plot(netting_shannon_mod4_gauss_sim_res)

#remove site type (p= 0.816   for netting_shannon_mod4_gauss)
netting_shannon_mod5_gauss <- lmer(shannon_diversity 
                                   #~Floral_simpson_index_T 
                                   ~ minutes_since_9am
                                   #+ top2_ratio
                                   #+ Site_type
                                   + dm_wind_velocity
                                   + dm_temperature
                                   + Days_since_start
                                   #+ Plot_Cover_T
                                   + (1 | site), 
                                   data = netting_diversity)

summary(netting_shannon_mod5_gauss)
## Linear mixed model fit by REML ['lmerMod']
## Formula: 
## shannon_diversity ~ minutes_since_9am + dm_wind_velocity + dm_temperature +  
##     Days_since_start + (1 | site)
##    Data: netting_diversity
## 
## REML criterion at convergence: 15.9
## 
## Scaled residuals: 
##     Min      1Q  Median      3Q     Max 
## -2.9166 -0.3920  0.1570  0.4592  1.7771 
## 
## Random effects:
##  Groups   Name        Variance Std.Dev.
##  site     (Intercept) 0.01197  0.1094  
##  Residual             0.05045  0.2246  
## Number of obs: 45, groups:  site, 9
## 
## Fixed effects:
##                   Estimate Std. Error t value
## (Intercept)        2.29545    0.04951  46.366
## minutes_since_9am -0.08176    0.03983  -2.052
## dm_wind_velocity   0.10870    0.06889   1.578
## dm_temperature     0.05940    0.06009   0.989
## Days_since_start  -0.25296    0.06081  -4.160
## 
## Correlation of Fixed Effects:
##             (Intr) mnt__9 dm_wn_ dm_tmp
## mnts_snc_9m  0.000                     
## dm_wnd_vlct  0.000  0.051              
## dm_tempertr  0.000  0.081  0.549       
## Dys_snc_str  0.000  0.192 -0.522 -0.241
parameters(netting_shannon_mod5_gauss)
## # Fixed Effects
## 
## Parameter         | Coefficient |   SE |         95% CI | t(38) |      p
## ------------------------------------------------------------------------
## (Intercept)       |        2.30 | 0.05 | [ 2.20,  2.40] | 46.37 | < .001
## minutes since 9am |       -0.08 | 0.04 | [-0.16,  0.00] | -2.05 | 0.047 
## dm wind velocity  |        0.11 | 0.07 | [-0.03,  0.25] |  1.58 | 0.123 
## dm temperature    |        0.06 | 0.06 | [-0.06,  0.18] |  0.99 | 0.329 
## Days since start  |       -0.25 | 0.06 | [-0.38, -0.13] | -4.16 | < .001
## 
## # Random Effects
## 
## Parameter            | Coefficient |   SE |       95% CI
## --------------------------------------------------------
## SD (Intercept: site) |        0.11 | 0.07 | [0.03, 0.36]
## SD (Residual)        |        0.22 | 0.03 | [0.18, 0.28]
## 
## Uncertainty intervals (equal-tailed) and p-values (two-tailed) computed
##   using a Wald t-distribution approximation. Uncertainty intervals for
##   random effect variances computed using a Wald z-distribution
##   approximation.
#check the model
check_model(netting_shannon_mod5_gauss, verbose = T)
## Some of the variables were in matrix-format - probably you used
##   `scale()` on your data?
##   If so, and you get an error, please try `datawizard::standardize()` to
##   standardize your data.
## Some of the variables were in matrix-format - probably you used
##   `scale()` on your data?
##   If so, and you get an error, please try `datawizard::standardize()` to
##   standardize your data.

#overdispersion
check_overdispersion(netting_shannon_mod5_gauss)
## # Overdispersion test
## 
##  dispersion ratio = 0.855
##           p-value =  0.56
## No overdispersion detected.
#collinearity
check_collinearity(netting_shannon_mod5_gauss)
## # Check for Multicollinearity
## 
## Low Correlation
## 
##               Term  VIF   VIF 95% CI Increased SE Tolerance Tolerance 95% CI
##  minutes_since_9am 1.08 [1.00, 4.13]         1.04      0.93     [0.24, 1.00]
##   dm_wind_velocity 1.89 [1.42, 2.92]         1.38      0.53     [0.34, 0.71]
##     dm_temperature 1.44 [1.15, 2.26]         1.20      0.69     [0.44, 0.87]
##   Days_since_start 1.48 [1.17, 2.30]         1.21      0.68     [0.43, 0.85]
# DHARMa package - simulate residuals and check model assumptions
netting_shannon_mod5_gauss_sim_res <- simulateResiduals(fittedModel = netting_shannon_mod5_gauss)
plot(netting_shannon_mod5_gauss_sim_res)

# remove dm_temperature (p = 0.329 in netting_shannon_mod5_gauss)
netting_shannon_mod6_gauss <- lmer(shannon_diversity 
                                   #~Floral_simpson_index_T 
                                   ~ minutes_since_9am
                                   #+ top2_ratio
                                   #+ Site_type
                                   + dm_wind_velocity
                                   #+ dm_temperature
                                   + Days_since_start
                                   #+ Plot_Cover_T
                                   + (1 | site), 
                                   data = netting_diversity)
summary(netting_shannon_mod6_gauss)
## Linear mixed model fit by REML ['lmerMod']
## Formula: 
## shannon_diversity ~ minutes_since_9am + dm_wind_velocity + Days_since_start +  
##     (1 | site)
##    Data: netting_diversity
## 
## REML criterion at convergence: 13.1
## 
## Scaled residuals: 
##     Min      1Q  Median      3Q     Max 
## -3.0721 -0.5007  0.1774  0.4519  1.7432 
## 
## Random effects:
##  Groups   Name        Variance Std.Dev.
##  site     (Intercept) 0.01179  0.1086  
##  Residual             0.05048  0.2247  
## Number of obs: 45, groups:  site, 9
## 
## Fixed effects:
##                   Estimate Std. Error t value
## (Intercept)        2.29545    0.04932  46.545
## minutes_since_9am -0.08503    0.03968  -2.143
## dm_wind_velocity   0.07132    0.05737   1.243
## Days_since_start  -0.23849    0.05880  -4.056
## 
## Correlation of Fixed Effects:
##             (Intr) mnt__9 dm_wn_
## mnts_snc_9m  0.000              
## dm_wnd_vlct  0.000  0.008       
## Dys_snc_str  0.000  0.219 -0.480
parameters(netting_shannon_mod6_gauss)
## # Fixed Effects
## 
## Parameter         | Coefficient |   SE |         95% CI | t(39) |      p
## ------------------------------------------------------------------------
## (Intercept)       |        2.30 | 0.05 | [ 2.20,  2.40] | 46.55 | < .001
## minutes since 9am |       -0.09 | 0.04 | [-0.17,  0.00] | -2.14 | 0.038 
## dm wind velocity  |        0.07 | 0.06 | [-0.04,  0.19] |  1.24 | 0.221 
## Days since start  |       -0.24 | 0.06 | [-0.36, -0.12] | -4.06 | < .001
## 
## # Random Effects
## 
## Parameter            | Coefficient |   SE |       95% CI
## --------------------------------------------------------
## SD (Intercept: site) |        0.11 | 0.06 | [0.04, 0.33]
## SD (Residual)        |        0.22 | 0.03 | [0.18, 0.28]
## 
## Uncertainty intervals (equal-tailed) and p-values (two-tailed) computed
##   using a Wald t-distribution approximation. Uncertainty intervals for
##   random effect variances computed using a Wald z-distribution
##   approximation.
#check the model
check_model(netting_shannon_mod6_gauss, verbose = TRUE)
## Some of the variables were in matrix-format - probably you used
##   `scale()` on your data?
##   If so, and you get an error, please try `datawizard::standardize()` to
##   standardize your data.
## Some of the variables were in matrix-format - probably you used
##   `scale()` on your data?
##   If so, and you get an error, please try `datawizard::standardize()` to
##   standardize your data.

#overdispersion
check_overdispersion(netting_shannon_mod6_gauss)
## # Overdispersion test
## 
##  dispersion ratio = 0.896
##           p-value = 0.712
## No overdispersion detected.
#collinearity
check_collinearity(netting_shannon_mod6_gauss)
## # Check for Multicollinearity
## 
## Low Correlation
## 
##               Term  VIF   VIF 95% CI Increased SE Tolerance Tolerance 95% CI
##  minutes_since_9am 1.07 [1.00, 5.31]         1.03      0.94     [0.19, 1.00]
##   dm_wind_velocity 1.32 [1.09, 2.16]         1.15      0.76     [0.46, 0.92]
##   Days_since_start 1.39 [1.12, 2.23]         1.18      0.72     [0.45, 0.89]
# DHARMa package - simulate residuals and check model assumptions
netting_shannon_mod6_gauss_sim_res <- simulateResiduals(fittedModel = netting_shannon_mod6_gauss)
plot(netting_shannon_mod6_gauss_sim_res)

IV.B.4.a. Compare the models with the performance package

# Compare the models with the performance package
netting_shannon_gauss_comp1 <- compare_performance(
  netting_shannon_mod1_gauss,
  netting_shannon_mod2_gauss,
  netting_shannon_mod3_gauss,
  netting_shannon_mod4_gauss,
  netting_shannon_mod5_gauss,
  netting_shannon_mod6_gauss,
  metrics = c("AICc", "BIC", "R2", "ICC", "RMSE")
)
# Print the comparison table
print(netting_shannon_gauss_comp1)
## # Comparison of Model Performance Indices
## 
## Name                       |   Model | AICc (weights) | BIC (weights)
## ---------------------------------------------------------------------
## netting_shannon_mod1_gauss | lmerMod |   25.5 (<.001) |  37.4 (<.001)
## netting_shannon_mod2_gauss | lmerMod |   22.0 (0.002) |  33.6 (<.001)
## netting_shannon_mod3_gauss | lmerMod |   18.8 (0.010) |  29.9 (0.003)
## netting_shannon_mod4_gauss | lmerMod |   15.8 (0.044) |  26.2 (0.020)
## netting_shannon_mod5_gauss | lmerMod |   12.1 (0.278) |  21.7 (0.198)
## netting_shannon_mod6_gauss | lmerMod |   10.3 (0.667) |  18.9 (0.778)
## 
## Name                       | R2 (cond.) | R2 (marg.) |   ICC |  RMSE
## --------------------------------------------------------------------
## netting_shannon_mod1_gauss |      0.525 |      0.377 | 0.237 | 0.203
## netting_shannon_mod2_gauss |      0.528 |      0.385 | 0.233 | 0.203
## netting_shannon_mod3_gauss |      0.536 |      0.389 | 0.241 | 0.203
## netting_shannon_mod4_gauss |      0.544 |      0.393 | 0.250 | 0.203
## netting_shannon_mod5_gauss |      0.525 |      0.413 | 0.192 | 0.205
## netting_shannon_mod6_gauss |      0.511 |      0.397 | 0.189 | 0.206
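
The AICc weights in the table above can be recovered from the AICc values alone. The following is a minimal sketch, not the internal code of the performance package: the helper name `akaike_weights` is hypothetical, and the inputs are the rounded AICc values printed above.

```r
# Hypothetical helper: Akaike weights from a vector of information criteria.
akaike_weights <- function(ic) {
  delta <- ic - min(ic)          # difference to the best (lowest) value
  rel_lik <- exp(-0.5 * delta)   # relative likelihood of each model
  rel_lik / sum(rel_lik)         # normalise so the weights sum to 1
}

# Rounded AICc values copied from the comparison table (mod1 ... mod6)
aicc <- c(mod1 = 25.5, mod2 = 22.0, mod3 = 18.8,
          mod4 = 15.8, mod5 = 12.1, mod6 = 10.3)
aicc_weights <- akaike_weights(aicc)
round(aicc_weights, 3)
```

Because the inputs are rounded to one decimal place, the result can differ from the printed weights in the third decimal. The ranking is unaffected: netting_shannon_mod6_gauss carries roughly two thirds of the weight.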

IV.B.4.b. Visualize the model results

#plot_model(netting_shannon_mod1_gauss, type = "est", show.values = TRUE, value.offset = 0.3)
#plot_model(netting_shannon_mod2_gauss, type = "est", show.values = TRUE, value.offset = 0.3)
#plot_model(netting_shannon_mod3_gauss, type = "est", show.values = TRUE, value.offset = 0.3)
#plot_model(netting_shannon_mod4_gauss, type = "est", show.values = TRUE, value.offset = 0.3)
#plot_model(netting_shannon_mod5_gauss, type = "est", show.values = TRUE, value.offset = 0.3)
plot_model(netting_shannon_mod6_gauss, type = "est", show.values = TRUE, value.offset = 0.3) 

plot_model(netting_shannon_mod6_gauss, 
           type = "est", 
           show.values = TRUE, 
           value.offset = 0.3,
           #sort.est = TRUE,
           axis.labels = c("Days Since Start",
                           "Wind Velocity",
                           "Time of Day")) +
    labs(title = "Transect walk: Shannon Index of Pollinators", x = "Predictors", y = "Estimate") + 
    theme(axis.text.y = element_text(hjust = 0))  # 0 = left, 1 = right

## days since start netting_shannon_mod6_gauss ---------
# Get the original mean and SD of days since start before scaling
days_mean <- mean(envir_data$Days_since_start, na.rm = TRUE)
days_sd <- sd(envir_data$Days_since_start, na.rm = TRUE)
# Get predictions on the scaled variable
pred_days <- ggpredict(netting_shannon_mod6_gauss , terms = "Days_since_start")
# Unscale the x-axis
pred_days$x_unscaled <- (pred_days$x * days_sd) + days_mean
# Plot
ggplot(pred_days, aes(x = x_unscaled, y = predicted)) +
  geom_line(size = 1.2, color = predictor_colors[["Days_since_start"]]) +
  geom_ribbon(aes(ymin = conf.low, ymax = conf.high), 
              fill = alpha(predictor_colors[["Days_since_start"]], 0.5)) +
  labs(
    title = "Transect walk: Predicted Pollinator Shannon Diversity vs Days Since Start",
    x = "Days Since Start",
    y = "Predicted Pollinator Shannon Diversity Index"
  ) 

## minutes since 9am netting_shannon_mod6_gauss ---------
# Get the original mean and SD of minutes since 9am before scaling
minutes_mean <- mean(envir_data$minutes_since_9am, na.rm = TRUE)
minutes_sd <- sd(envir_data$minutes_since_9am, na.rm = TRUE)
# Get predictions on the scaled variable
pred_minutes <- ggpredict(netting_shannon_mod6_gauss , terms = "minutes_since_9am")
# Unscale the x-axis
pred_minutes$x_unscaled <- (pred_minutes$x * minutes_sd) + minutes_mean
# divide by 60 to get hours, and add 9 to get hour of the day
pred_minutes$x_unscaled <- (pred_minutes$x_unscaled / 60) + 9
# Plot
ggplot(pred_minutes, aes(x = x_unscaled, y = predicted)) +
  geom_line(size = 1.2, color = predictor_colors[["minutes_since_9am"]]) +
  geom_ribbon(aes(ymin = conf.low, ymax = conf.high), 
              fill = alpha(predictor_colors[["minutes_since_9am"]], 0.5)) +
  labs(
    title = "Transect walk: Predicted Pollinator Shannon Diversity vs Time of day",
    x = "Time of day",
    y = "Predicted Pollinator Shannon Diversity Index"
  ) 
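
Both plots above back-transform a scaled predictor to its original units before plotting. A minimal sketch of that step follows, using a hypothetical helper `unscale` and made-up values rather than the study data. It assumes the predictor was standardised with the mean and SD of the same rows that are used for the back-transformation.

```r
# Hypothetical helper: map z-scores back to the original measurement scale.
unscale <- function(z, center, spread) {
  z * spread + center
}

# Illustrative values only (not taken from the study data)
minutes_raw <- c(0, 60, 120, 180, 240)
minutes_z <- as.numeric(scale(minutes_raw))   # what the model sees
minutes_back <- unscale(minutes_z, mean(minutes_raw), sd(minutes_raw))

# Convert minutes since 9am to hour of day, as in the plot above
hour_of_day <- minutes_back / 60 + 9
hour_of_day
```

If the mean and SD come from a different data frame than the one that was scaled for the model, the axis will be shifted or stretched, so it is worth checking that both refer to the same observations.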

IV.B.4.c. Interpretation of results

#removal of unnecessary objects - all starting with netting_shannon
rm(list = ls(pattern = "^netting_shannon_"))

IV.B.5. NETTING Simpson Index - Beta regression

The Simpson index per transect was calculated in the previous section and is stored in the netting_diversity data frame.

#histogram of simpson diversity, binwidth =0.01
netting_diversity %>%
  ggplot(aes(x = simpson_diversity)) +
  geom_histogram(binwidth = 0.01, fill = "lightblue", color = "black") +
  labs(title = "Histogram of Simpson Diversity Index",
       x = "Simpson Diversity Index",
       y = "Count")

#testing the normality of the simpson index
shapiro.test(netting_diversity$simpson_diversity) # p-value = 0.0034, simpson index is not normally distributed
## 
##  Shapiro-Wilk normality test
## 
## data:  netting_diversity$simpson_diversity
## W = 0.91719, p-value = 0.0034
#testing skewness
datawizard::describe_distribution(netting_diversity$simpson_diversity)
## Mean |   SD |  IQR |        Range | Skewness | Kurtosis |  n | n_Missing
## ------------------------------------------------------------------------
## 0.86 | 0.05 | 0.07 | [0.74, 0.94] |    -0.97 |     0.40 | 45 |         0

The Simpson index is not normally distributed (Shapiro-Wilk test: W = 0.917, p = 0.0034). Its skewness is -0.97, indicating a moderate left skew, and its kurtosis is 0.40 (reported as excess kurtosis, so the tails are only slightly heavier than normal). Because the Simpson index is bounded between 0 and 1 (observed range: 0.74 to 0.94), we model it with a beta regression with a logit link; a binomial regression with a logit link would be an alternative.
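
A beta regression requires the response to lie strictly inside the interval (0, 1). The observed range here (0.74 to 0.94) satisfies this, so no adjustment is needed. As a hedged sketch for cases where exact 0 or 1 values occur, the check and one common adjustment (the transformation proposed by Smithson and Verkuilen, 2006) could look like this. The helper name `squeeze_unit` is hypothetical and the values are made up.

```r
# Hypothetical helper: shrink proportions away from exact 0 and 1
# (transformation proposed by Smithson & Verkuilen, 2006).
squeeze_unit <- function(y) {
  n <- length(y)
  (y * (n - 1) + 0.5) / n
}

# Illustrative values only (not the study data)
y <- c(0, 0.74, 0.86, 0.94, 1)
needs_squeeze <- any(y <= 0 | y >= 1)
y_beta <- if (needs_squeeze) squeeze_unit(y) else y
range(y_beta)
```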

netting_simpson_mod1_beta <- glmmTMB(simpson_diversity 
                                     ~Floral_simpson_index_T 
                                     + minutes_since_9am 
                                     + dm_wind_velocity
                                     + top2_ratio
                                     + Site_type
                                     + dm_temperature
                                     + Days_since_start 
                                     + Plot_Cover_T
                                     + (1 | site),
                                     family = beta_family(),  
                                     data = netting_diversity)
summary(netting_simpson_mod1_beta)
##  Family: beta  ( logit )
## Formula:          
## simpson_diversity ~ Floral_simpson_index_T + minutes_since_9am +  
##     dm_wind_velocity + top2_ratio + Site_type + dm_temperature +  
##     Days_since_start + Plot_Cover_T + (1 | site)
## Data: netting_diversity
## 
##      AIC      BIC   logLik deviance df.resid 
##   -150.3   -130.4     86.1   -172.3       34 
## 
## Random effects:
## 
## Conditional model:
##  Groups Name        Variance  Std.Dev. 
##  site   (Intercept) 1.112e-10 1.055e-05
## Number of obs: 45, groups:  site, 9
## 
## Dispersion parameter for beta family (): 87.6 
## 
## Conditional model:
##                         Estimate Std. Error z value Pr(>|z|)    
## (Intercept)              1.77926    0.07358  24.181  < 2e-16 ***
## Floral_simpson_index_T   0.02083    0.06858   0.304    0.761    
## minutes_since_9am       -0.11062    0.05304  -2.086    0.037 *  
## dm_wind_velocity         0.10846    0.06945   1.562    0.118    
## top2_ratio               0.01549    0.05170   0.300    0.765    
## Site_typeyoung_restored  0.15242    0.13212   1.154    0.249    
## dm_temperature           0.02040    0.07938   0.257    0.797    
## Days_since_start        -0.28590    0.05778  -4.948  7.5e-07 ***
## Plot_Cover_T             0.03120    0.07717   0.404    0.686    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
parameters(netting_simpson_mod1_beta)
## Your model may suffer from singularity (see `?lme4::isSingular` and
##   `?performance::check_singularity`).
##   Some of the confidence intervals of the random effects parameters are
##   probably not meaningful!
##   You may try to impose a prior on the random effects parameters, e.g.
##   using the glmmTMB package.
## # Fixed Effects
## 
## Parameter                  | Coefficient |   SE |         95% CI |     z |      p
## ---------------------------------------------------------------------------------
## (Intercept)                |        1.78 | 0.07 | [ 1.64,  1.92] | 24.18 | < .001
## Floral simpson index T     |        0.02 | 0.07 | [-0.11,  0.16] |  0.30 | 0.761 
## minutes since 9am          |       -0.11 | 0.05 | [-0.21, -0.01] | -2.09 | 0.037 
## dm wind velocity           |        0.11 | 0.07 | [-0.03,  0.24] |  1.56 | 0.118 
## top2 ratio                 |        0.02 | 0.05 | [-0.09,  0.12] |  0.30 | 0.765 
## Site type [young_restored] |        0.15 | 0.13 | [-0.11,  0.41] |  1.15 | 0.249 
## dm temperature             |        0.02 | 0.08 | [-0.14,  0.18] |  0.26 | 0.797 
## Days since start           |       -0.29 | 0.06 | [-0.40, -0.17] | -4.95 | < .001
## Plot Cover T               |        0.03 | 0.08 | [-0.12,  0.18] |  0.40 | 0.686 
## 
## # Dispersion
## 
## Parameter   | Coefficient |          95% CI
## -------------------------------------------
## (Intercept) |       87.61 | [57.98, 132.40]
## 
## # Random Effects Variances
## 
## Parameter            | Coefficient |      95% CI
## ------------------------------------------------
## SD (Intercept: site) |    1.05e-05 | [0.00, Inf]
## 
## Uncertainty intervals (equal-tailed) and p-values (two-tailed) computed
##   using a Wald z-distribution approximation.
#check for singularity
performance::check_singularity(netting_simpson_mod1_beta)
## [1] TRUE
#check the model
check_model(netting_simpson_mod1_beta, verbose = TRUE)
## `check_outliers()` does not yet support models of class `glmmTMB`.

#overdispersion
check_overdispersion(netting_simpson_mod1_beta)
## # Overdispersion test
## 
##  dispersion ratio = 1.112
##           p-value = 0.616
## No overdispersion detected.
#collinearity
check_collinearity(netting_simpson_mod1_beta)
## # Check for Multicollinearity
## 
## Low Correlation
## 
##                    Term  VIF   VIF 95% CI Increased SE Tolerance
##  Floral_simpson_index_T 2.22 [1.67, 3.23]         1.49      0.45
##       minutes_since_9am 1.31 [1.10, 1.94]         1.14      0.77
##        dm_wind_velocity 2.34 [1.74, 3.41]         1.53      0.43
##              top2_ratio 1.18 [1.03, 1.91]         1.08      0.85
##               Site_type 2.09 [1.59, 3.04]         1.45      0.48
##          dm_temperature 2.95 [2.14, 4.34]         1.72      0.34
##        Days_since_start 1.60 [1.27, 2.31]         1.26      0.63
##            Plot_Cover_T 2.62 [1.92, 3.83]         1.62      0.38
##  Tolerance 95% CI
##      [0.31, 0.60]
##      [0.51, 0.91]
##      [0.29, 0.57]
##      [0.52, 0.97]
##      [0.33, 0.63]
##      [0.23, 0.47]
##      [0.43, 0.79]
##      [0.26, 0.52]
# DHARMa package - simulate residuals and check model assumptions
netting_simpson_mod1_beta_sim_res <- simulateResiduals(fittedModel = netting_simpson_mod1_beta)
plot(netting_simpson_mod1_beta_sim_res)
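
The coefficients of the beta model are on the logit scale, so they are not changes in the Simpson index itself. The sketch below back-transforms two of them with the inverse logit, using the estimates printed in the summary above. It assumes that the continuous predictors are standardised (so zero corresponds to the mean) and that the intercept refers to the reference level of Site_type.

```r
# Coefficients copied from summary(netting_simpson_mod1_beta) (logit scale)
b_intercept <- 1.77926
b_days <- -0.28590

# Inverse logit: expected Simpson index when all other terms are zero
# (assumed here: scaled predictors at their mean, reference site type)
mu_baseline <- plogis(b_intercept)

# Same, one unit (assumed: one standard deviation) later in the season
mu_later <- plogis(b_intercept + b_days)

round(c(baseline = mu_baseline, later = mu_later), 3)
```

The baseline value lands close to the observed mean of 0.86, and the Days_since_start effect corresponds to a drop of a few hundredths in the expected index per unit of the scaled predictor.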

# remove top2_ratio (p = 0.765 in netting_simpson_mod1_beta)
netting_simpson_mod2_beta <- glmmTMB(simpson_diversity 
                                     ~Floral_simpson_index_T 
                                     + minutes_since_9am 
                                     + dm_wind_velocity
                                     #+ top2_ratio
                                     + Site_type
                                     + dm_temperature
                                     + Days_since_start 
                                     + Plot_Cover_T
                                     + (1 | site),
                                     family = beta_family(),  
                                     data = netting_diversity)

summary(netting_simpson_mod2_beta)
##  Family: beta  ( logit )
## Formula:          
## simpson_diversity ~ Floral_simpson_index_T + minutes_since_9am +  
##     dm_wind_velocity + Site_type + dm_temperature + Days_since_start +  
##     Plot_Cover_T + (1 | site)
## Data: netting_diversity
## 
##      AIC      BIC   logLik deviance df.resid 
##   -152.2   -134.1     86.1   -172.2       35 
## 
## Random effects:
## 
## Conditional model:
##  Groups Name        Variance Std.Dev. 
##  site   (Intercept) 2.06e-10 1.435e-05
## Number of obs: 45, groups:  site, 9
## 
## Dispersion parameter for beta family (): 87.4 
## 
## Conditional model:
##                         Estimate Std. Error z value Pr(>|z|)    
## (Intercept)              1.78118    0.07344  24.253  < 2e-16 ***
## Floral_simpson_index_T   0.01755    0.06780   0.259    0.796    
## minutes_since_9am       -0.10865    0.05263  -2.064    0.039 *  
## dm_wind_velocity         0.10870    0.06941   1.566    0.117    
## Site_typeyoung_restored  0.14779    0.13139   1.125    0.261    
## dm_temperature           0.01836    0.07917   0.232    0.817    
## Days_since_start        -0.28635    0.05774  -4.959 7.09e-07 ***
## Plot_Cover_T             0.03273    0.07708   0.425    0.671    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
parameters(netting_simpson_mod2_beta)
## Your model may suffer from singularity (see `?lme4::isSingular` and
##   `?performance::check_singularity`).
##   Some of the confidence intervals of the random effects parameters are
##   probably not meaningful!
##   You may try to impose a prior on the random effects parameters, e.g.
##   using the glmmTMB package.
## # Fixed Effects
## 
## Parameter                  | Coefficient |   SE |         95% CI |     z |      p
## ---------------------------------------------------------------------------------
## (Intercept)                |        1.78 | 0.07 | [ 1.64,  1.93] | 24.25 | < .001
## Floral simpson index T     |        0.02 | 0.07 | [-0.12,  0.15] |  0.26 | 0.796 
## minutes since 9am          |       -0.11 | 0.05 | [-0.21, -0.01] | -2.06 | 0.039 
## dm wind velocity           |        0.11 | 0.07 | [-0.03,  0.24] |  1.57 | 0.117 
## Site type [young_restored] |        0.15 | 0.13 | [-0.11,  0.41] |  1.12 | 0.261 
## dm temperature             |        0.02 | 0.08 | [-0.14,  0.17] |  0.23 | 0.817 
## Days since start           |       -0.29 | 0.06 | [-0.40, -0.17] | -4.96 | < .001
## Plot Cover T               |        0.03 | 0.08 | [-0.12,  0.18] |  0.42 | 0.671 
## 
## # Dispersion
## 
## Parameter   | Coefficient |          95% CI
## -------------------------------------------
## (Intercept) |       87.43 | [57.86, 132.12]
## 
## # Random Effects Variances
## 
## Parameter            | Coefficient |      95% CI
## ------------------------------------------------
## SD (Intercept: site) |    1.44e-05 | [0.00, Inf]
## 
## Uncertainty intervals (equal-tailed) and p-values (two-tailed) computed
##   using a Wald z-distribution approximation.
#check for singularity
performance::check_singularity(netting_simpson_mod2_beta)
## [1] TRUE
#check the model
check_model(netting_simpson_mod2_beta, verbose = TRUE)
## `check_outliers()` does not yet support models of class `glmmTMB`.

#overdispersion
check_overdispersion(netting_simpson_mod2_beta)
## # Overdispersion test
## 
##  dispersion ratio = 1.109
##           p-value = 0.624
## No overdispersion detected.
#collinearity
check_collinearity(netting_simpson_mod2_beta)
## # Check for Multicollinearity
## 
## Low Correlation
## 
##                    Term  VIF   VIF 95% CI Increased SE Tolerance
##  Floral_simpson_index_T 2.15 [1.61, 3.16]         1.47      0.46
##       minutes_since_9am 1.28 [1.08, 1.95]         1.13      0.78
##        dm_wind_velocity 2.33 [1.72, 3.43]         1.52      0.43
##               Site_type 2.07 [1.56, 3.03]         1.44      0.48
##          dm_temperature 2.94 [2.12, 4.38]         1.72      0.34
##        Days_since_start 1.58 [1.25, 2.31]         1.26      0.63
##            Plot_Cover_T 2.62 [1.91, 3.88]         1.62      0.38
##  Tolerance 95% CI
##      [0.32, 0.62]
##      [0.51, 0.92]
##      [0.29, 0.58]
##      [0.33, 0.64]
##      [0.23, 0.47]
##      [0.43, 0.80]
##      [0.26, 0.52]
# DHARMa package - simulate residuals and check model assumptions
netting_simpson_mod2_beta_sim_res <- simulateResiduals(fittedModel = netting_simpson_mod2_beta)
plot(netting_simpson_mod2_beta_sim_res)

# remove dm_temperature (p = 0.817 in netting_simpson_mod2_beta)

netting_simpson_mod3_beta <- glmmTMB(simpson_diversity 
                                     ~Floral_simpson_index_T 
                                     + minutes_since_9am 
                                     + dm_wind_velocity
                                     #+ top2_ratio
                                     + Site_type
                                     #+ dm_temperature
                                     + Days_since_start 
                                     + Plot_Cover_T
                                     + (1 | site),
                                     family = beta_family(),  
                                     data = netting_diversity)
summary(netting_simpson_mod3_beta)
##  Family: beta  ( logit )
## Formula:          
## simpson_diversity ~ Floral_simpson_index_T + minutes_since_9am +  
##     dm_wind_velocity + Site_type + Days_since_start + Plot_Cover_T +  
##     (1 | site)
## Data: netting_diversity
## 
##      AIC      BIC   logLik deviance df.resid 
##   -154.1   -137.9     86.1   -172.1       36 
## 
## Random effects:
## 
## Conditional model:
##  Groups Name        Variance  Std.Dev. 
##  site   (Intercept) 1.519e-10 1.232e-05
## Number of obs: 45, groups:  site, 9
## 
## Dispersion parameter for beta family (): 87.3 
## 
## Conditional model:
##                         Estimate Std. Error z value Pr(>|z|)    
## (Intercept)              1.77259    0.06328  28.011  < 2e-16 ***
## Floral_simpson_index_T   0.01790    0.06776   0.264   0.7917    
## minutes_since_9am       -0.11183    0.05086  -2.199   0.0279 *  
## dm_wind_velocity         0.09796    0.05193   1.886   0.0593 .  
## Site_typeyoung_restored  0.16704    0.10158   1.645   0.1001    
## Days_since_start        -0.28558    0.05770  -4.949 7.45e-07 ***
## Plot_Cover_T             0.04013    0.07022   0.571   0.5677    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
parameters(netting_simpson_mod3_beta)
## Your model may suffer from singularity (see `?lme4::isSingular` and
##   `?performance::check_singularity`).
##   Some of the confidence intervals of the random effects parameters are
##   probably not meaningful!
##   You may try to impose a prior on the random effects parameters, e.g.
##   using the glmmTMB package.
## # Fixed Effects
## 
## Parameter                  | Coefficient |   SE |         95% CI |     z |      p
## ---------------------------------------------------------------------------------
## (Intercept)                |        1.77 | 0.06 | [ 1.65,  1.90] | 28.01 | < .001
## Floral simpson index T     |        0.02 | 0.07 | [-0.11,  0.15] |  0.26 | 0.792 
## minutes since 9am          |       -0.11 | 0.05 | [-0.21, -0.01] | -2.20 | 0.028 
## dm wind velocity           |        0.10 | 0.05 | [ 0.00,  0.20] |  1.89 | 0.059 
## Site type [young_restored] |        0.17 | 0.10 | [-0.03,  0.37] |  1.64 | 0.100 
## Days since start           |       -0.29 | 0.06 | [-0.40, -0.17] | -4.95 | < .001
## Plot Cover T               |        0.04 | 0.07 | [-0.10,  0.18] |  0.57 | 0.568 
## 
## # Dispersion
## 
## Parameter   | Coefficient |          95% CI
## -------------------------------------------
## (Intercept) |       87.33 | [57.79, 131.96]
## 
## # Random Effects Variances
## 
## Parameter            | Coefficient |      95% CI
## ------------------------------------------------
## SD (Intercept: site) |    1.23e-05 | [0.00, Inf]
## 
## Uncertainty intervals (equal-tailed) and p-values (two-tailed) computed
##   using a Wald z-distribution approximation.
#check for singularity
performance::check_singularity(netting_simpson_mod3_beta)
## [1] TRUE
#check the model
check_model(netting_simpson_mod3_beta, verbose = TRUE)
## `check_outliers()` does not yet support models of class `glmmTMB`.

#overdispersion
check_overdispersion(netting_simpson_mod3_beta)
## # Overdispersion test
## 
##  dispersion ratio = 1.110
##           p-value =  0.56
## No overdispersion detected.
#collinearity
check_collinearity(netting_simpson_mod3_beta)
## # Check for Multicollinearity
## 
## Low Correlation
## 
##                    Term  VIF   VIF 95% CI Increased SE Tolerance
##  Floral_simpson_index_T 2.15 [1.60, 3.20]         1.47      0.47
##       minutes_since_9am 1.19 [1.04, 1.98]         1.09      0.84
##        dm_wind_velocity 1.29 [1.09, 2.00]         1.14      0.77
##               Site_type 1.23 [1.06, 1.97]         1.11      0.81
##        Days_since_start 1.58 [1.25, 2.34]         1.26      0.63
##            Plot_Cover_T 2.17 [1.61, 3.23]         1.47      0.46
##  Tolerance 95% CI
##      [0.31, 0.63]
##      [0.51, 0.96]
##      [0.50, 0.92]
##      [0.51, 0.95]
##      [0.43, 0.80]
##      [0.31, 0.62]
# DHARMa package - simulate residuals and check model assumptions
netting_simpson_mod3_beta_sim_res <- simulateResiduals(fittedModel = netting_simpson_mod3_beta)
plot(netting_simpson_mod3_beta_sim_res)

# remove Floral_simpson_index_T (p = 0.792 in netting_simpson_mod3_beta)
netting_simpson_mod4_beta <- glmmTMB(simpson_diversity 
                                     #~Floral_simpson_index_T 
                                     ~ minutes_since_9am 
                                     + dm_wind_velocity
                                     #+ top2_ratio
                                     + Site_type
                                     #+ dm_temperature
                                     + Days_since_start 
                                     + Plot_Cover_T
                                     + (1 | site),
                                     family = beta_family(),  
                                     data = netting_diversity)
summary(netting_simpson_mod4_beta)
##  Family: beta  ( logit )
## Formula:          
## simpson_diversity ~ minutes_since_9am + dm_wind_velocity + Site_type +  
##     Days_since_start + Plot_Cover_T + (1 | site)
## Data: netting_diversity
## 
##      AIC      BIC   logLik deviance df.resid 
##   -156.1   -141.6     86.0   -172.1       37 
## 
## Random effects:
## 
## Conditional model:
##  Groups Name        Variance  Std.Dev. 
##  site   (Intercept) 1.085e-10 1.042e-05
## Number of obs: 45, groups:  site, 9
## 
## Dispersion parameter for beta family (): 87.2 
## 
## Conditional model:
##                         Estimate Std. Error z value Pr(>|z|)    
## (Intercept)              1.77563    0.06232  28.493  < 2e-16 ***
## minutes_since_9am       -0.11443    0.04990  -2.293   0.0218 *  
## dm_wind_velocity         0.09673    0.05184   1.866   0.0621 .  
## Site_typeyoung_restored  0.15992    0.09801   1.632   0.1028    
## Days_since_start        -0.28630    0.05770  -4.962 6.98e-07 ***
## Plot_Cover_T             0.02672    0.04828   0.554   0.5799    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
parameters(netting_simpson_mod4_beta)
## Your model may suffer from singularity (see `?lme4::isSingular` and
##   `?performance::check_singularity`).
##   Some of the confidence intervals of the random effects parameters are
##   probably not meaningful!
##   You may try to impose a prior on the random effects parameters, e.g.
##   using the glmmTMB package.
## # Fixed Effects
## 
## Parameter                  | Coefficient |   SE |         95% CI |     z |      p
## ---------------------------------------------------------------------------------
## (Intercept)                |        1.78 | 0.06 | [ 1.65,  1.90] | 28.49 | < .001
## minutes since 9am          |       -0.11 | 0.05 | [-0.21, -0.02] | -2.29 | 0.022 
## dm wind velocity           |        0.10 | 0.05 | [ 0.00,  0.20] |  1.87 | 0.062 
## Site type [young_restored] |        0.16 | 0.10 | [-0.03,  0.35] |  1.63 | 0.103 
## Days since start           |       -0.29 | 0.06 | [-0.40, -0.17] | -4.96 | < .001
## Plot Cover T               |        0.03 | 0.05 | [-0.07,  0.12] |  0.55 | 0.580 
## 
## # Dispersion
## 
## Parameter   | Coefficient |          95% CI
## -------------------------------------------
## (Intercept) |       87.18 | [57.70, 131.74]
## 
## # Random Effects Variances
## 
## Parameter            | Coefficient |      95% CI
## ------------------------------------------------
## SD (Intercept: site) |    1.04e-05 | [0.00, Inf]
## 
## Uncertainty intervals (equal-tailed) and p-values (two-tailed) computed
##   using a Wald z-distribution approximation.
#check for singularity
performance::check_singularity(netting_simpson_mod4_beta)
## [1] TRUE
#check the model
check_model(netting_simpson_mod4_beta, verbose = TRUE)
## `check_outliers()` does not yet support models of class `glmmTMB`.

#overdispersion
check_overdispersion(netting_simpson_mod4_beta)
## # Overdispersion test
## 
##  dispersion ratio = 1.114
##           p-value = 0.552
## No overdispersion detected.
#collinearity
check_collinearity(netting_simpson_mod4_beta)
## # Check for Multicollinearity
## 
## Low Correlation
## 
##               Term  VIF    VIF 95% CI Increased SE Tolerance Tolerance 95% CI
##  minutes_since_9am 1.15 [1.02,  2.13]         1.07      0.87     [0.47, 0.98]
##   dm_wind_velocity 1.28 [1.08,  2.03]         1.13      0.78     [0.49, 0.93]
##          Site_type 1.15 [1.02,  2.14]         1.07      0.87     [0.47, 0.98]
##   Days_since_start 1.58 [1.24,  2.38]         1.26      0.63     [0.42, 0.81]
##       Plot_Cover_T 1.04 [1.00, 31.32]         1.02      0.96     [0.03, 1.00]
# DHARMa package - simulate residuals and check model assumptions
netting_simpson_mod4_beta_sim_res <- simulateResiduals(fittedModel = netting_simpson_mod4_beta)
plot(netting_simpson_mod4_beta_sim_res)

# remove Plot_Cover_T (p = 0.580 in netting_simpson_mod4_beta)

netting_simpson_mod5_beta <- glmmTMB(simpson_diversity 
                                     #~Floral_simpson_index_T 
                                     ~ minutes_since_9am 
                                     + dm_wind_velocity
                                     #+ top2_ratio
                                     + Site_type
                                     #+ dm_temperature
                                     + Days_since_start 
                                     #+ Plot_Cover_T
                                     + (1 | site),
                                     family = beta_family(),  
                                     data = netting_diversity)
summary(netting_simpson_mod5_beta)
##  Family: beta  ( logit )
## Formula:          
## simpson_diversity ~ minutes_since_9am + dm_wind_velocity + Site_type +  
##     Days_since_start + (1 | site)
## Data: netting_diversity
## 
##      AIC      BIC   logLik deviance df.resid 
##   -157.7   -145.1     85.9   -171.7       38 
## 
## Random effects:
## 
## Conditional model:
##  Groups Name        Variance Std.Dev. 
##  site   (Intercept) 1.46e-10 1.208e-05
## Number of obs: 45, groups:  site, 9
## 
## Dispersion parameter for beta family (): 86.6 
## 
## Conditional model:
##                         Estimate Std. Error z value Pr(>|z|)    
## (Intercept)              1.77880    0.06236  28.523  < 2e-16 ***
## minutes_since_9am       -0.11737    0.04990  -2.352   0.0187 *  
## dm_wind_velocity         0.09855    0.05235   1.883   0.0597 .  
## Site_typeyoung_restored  0.15345    0.09788   1.568   0.1170    
## Days_since_start        -0.28896    0.05779  -5.000 5.72e-07 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
parameters(netting_simpson_mod5_beta)
## Your model may suffer from singularity (see `?lme4::isSingular` and
##   `?performance::check_singularity`).
##   Some of the confidence intervals of the random effects parameters are
##   probably not meaningful!
##   You may try to impose a prior on the random effects parameters, e.g.
##   using the glmmTMB package.
## # Fixed Effects
## 
## Parameter                  | Coefficient |   SE |         95% CI |     z |      p
## ---------------------------------------------------------------------------------
## (Intercept)                |        1.78 | 0.06 | [ 1.66,  1.90] | 28.52 | < .001
## minutes since 9am          |       -0.12 | 0.05 | [-0.22, -0.02] | -2.35 | 0.019 
## dm wind velocity           |        0.10 | 0.05 | [ 0.00,  0.20] |  1.88 | 0.060 
## Site type [young_restored] |        0.15 | 0.10 | [-0.04,  0.35] |  1.57 | 0.117 
## Days since start           |       -0.29 | 0.06 | [-0.40, -0.18] | -5.00 | < .001
## 
## # Dispersion
## 
## Parameter   | Coefficient |          95% CI
## -------------------------------------------
## (Intercept) |       86.62 | [57.32, 130.89]
## 
## # Random Effects Variances
## 
## Parameter            | Coefficient |      95% CI
## ------------------------------------------------
## SD (Intercept: site) |    1.21e-05 | [0.00, Inf]
## 
## Uncertainty intervals (equal-tailed) and p-values (two-tailed) computed
##   using a Wald z-distribution approximation.
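
The Wald statistics in the table above can be reproduced by hand, which makes clear what the note about the "Wald z-distribution approximation" means. The sketch below uses only base R; the estimate and standard error are copied from the `summary()` output for `minutes_since_9am`, and the object names are illustrative rather than part of the analysis.

```r
# Sketch: reproduce the Wald z, p-value and 95% CI for one fixed effect.
# Estimate and SE are copied from summary(netting_simpson_mod5_beta).
estimate  <- -0.11737
std_error <-  0.04990

z_value <- estimate / std_error                           # Wald z statistic
p_value <- 2 * pnorm(-abs(z_value))                       # two-tailed p-value
ci_95   <- estimate + c(-1, 1) * qnorm(0.975) * std_error # equal-tailed 95% CI

round(c(z = z_value, p = p_value, lower = ci_95[1], upper = ci_95[2]), 3)
```

These intervals are on the logit scale and rely on a normal approximation, which is something to keep in mind with 45 observations.
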
#check for singularity
performance::check_singularity(netting_simpson_mod5_beta)
## [1] TRUE
#check the model
check_model(netting_simpson_mod5_beta, verbose = T)
## `check_outliers()` does not yet support models of class `glmmTMB`.

#overdispersion
check_overdispersion(netting_simpson_mod5_beta)
## # Overdispersion test
## 
##  dispersion ratio = 1.111
##           p-value = 0.584
## No overdispersion detected.
#collinearity
check_collinearity(netting_simpson_mod5_beta)
## # Check for Multicollinearity
## 
## Low Correlation
## 
##               Term  VIF   VIF 95% CI Increased SE Tolerance Tolerance 95% CI
##  minutes_since_9am 1.13 [1.01, 2.29]         1.06      0.88     [0.44, 0.99]
##   dm_wind_velocity 1.29 [1.08, 2.06]         1.13      0.78     [0.48, 0.93]
##          Site_type 1.14 [1.01, 2.27]         1.07      0.88     [0.44, 0.99]
##   Days_since_start 1.58 [1.24, 2.41]         1.26      0.63     [0.42, 0.81]
# DHARMa package - simulate residuals and check model assumptions
netting_simpson_mod5_beta_sim_res <- simulateResiduals(fittedModel = netting_simpson_mod5_beta)
plot(netting_simpson_mod5_beta_sim_res)

#remove site type (p = 0.117 for netting_simpson_mod5_beta)
netting_simpson_mod6_beta <- glmmTMB(simpson_diversity 
                                     #~Floral_simpson_index_T 
                                     ~ minutes_since_9am 
                                     + dm_wind_velocity
                                     #+ top2_ratio
                                     #+ Site_type
                                     #+ dm_temperature
                                     + Days_since_start 
                                     #+ Plot_Cover_T
                                     + (1 | site),
                                     family = beta_family(),  
                                     data = netting_diversity)
summary(netting_simpson_mod6_beta)
##  Family: beta  ( logit )
## Formula:          
## simpson_diversity ~ minutes_since_9am + dm_wind_velocity + Days_since_start +  
##     (1 | site)
## Data: netting_diversity
## 
##      AIC      BIC   logLik deviance df.resid 
##   -157.3   -146.5     84.7   -169.3       39 
## 
## Random effects:
## 
## Conditional model:
##  Groups Name        Variance  Std.Dev.
##  site   (Intercept) 0.0007677 0.02771 
## Number of obs: 45, groups:  site, 9
## 
## Dispersion parameter for beta family (): 82.7 
## 
## Conditional model:
##                   Estimate Std. Error z value Pr(>|z|)    
## (Intercept)        1.84549    0.04891   37.73  < 2e-16 ***
## minutes_since_9am -0.10904    0.05112   -2.13   0.0329 *  
## dm_wind_velocity   0.09407    0.05638    1.67   0.0952 .  
## Days_since_start  -0.25933    0.05669   -4.57 4.77e-06 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
parameters(netting_simpson_mod6_beta)
## # Fixed Effects
## 
## Parameter         | Coefficient |   SE |         95% CI |     z |      p
## ------------------------------------------------------------------------
## (Intercept)       |        1.85 | 0.05 | [ 1.75,  1.94] | 37.73 | < .001
## minutes since 9am |       -0.11 | 0.05 | [-0.21, -0.01] | -2.13 | 0.033 
## dm wind velocity  |        0.09 | 0.06 | [-0.02,  0.20] |  1.67 | 0.095 
## Days since start  |       -0.26 | 0.06 | [-0.37, -0.15] | -4.57 | < .001
## 
## # Dispersion
## 
## Parameter   | Coefficient |          95% CI
## -------------------------------------------
## (Intercept) |       82.71 | [52.47, 130.37]
## 
## # Random Effects Variances
## 
## Parameter            | Coefficient |          95% CI
## ----------------------------------------------------
## SD (Intercept: site) |        0.03 | [0.00, 7595.66]
## 
## Uncertainty intervals (equal-tailed) and p-values (two-tailed) computed
##   using a Wald z-distribution approximation.
#check for singularity
performance::check_singularity(netting_simpson_mod6_beta)
## [1] FALSE
#check the model
check_model(netting_simpson_mod6_beta, verbose = T)
## `check_outliers()` does not yet support models of class `glmmTMB`.

#overdispersion
check_overdispersion(netting_simpson_mod6_beta)
## # Overdispersion test
## 
##  dispersion ratio = 1.118
##           p-value = 0.568
## No overdispersion detected.
#collinearity
check_collinearity(netting_simpson_mod6_beta)
## # Check for Multicollinearity
## 
## Low Correlation
## 
##               Term  VIF   VIF 95% CI Increased SE Tolerance Tolerance 95% CI
##  minutes_since_9am 1.16 [1.02, 2.22]         1.08      0.86     [0.45, 0.98]
##   dm_wind_velocity 1.32 [1.09, 2.12]         1.15      0.76     [0.47, 0.92]
##   Days_since_start 1.46 [1.17, 2.29]         1.21      0.68     [0.44, 0.86]
# DHARMa package - simulate residuals and check model assumptions
netting_simpson_mod6_beta_sim_res <- simulateResiduals(fittedModel = netting_simpson_mod6_beta)
plot(netting_simpson_mod6_beta_sim_res)
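
The columns of the collinearity tables above are simple transformations of the VIF, which helps when reading them. The following base R sketch recomputes tolerance and the standard error inflation from the rounded VIF values reported for `netting_simpson_mod6_beta`; the variable names are illustrative.

```r
# Sketch: how the collinearity table columns relate to the VIF.
# VIF values are copied (rounded) from check_collinearity(netting_simpson_mod6_beta).
vif <- c(minutes_since_9am = 1.16,
         dm_wind_velocity  = 1.32,
         Days_since_start  = 1.46)

tolerance    <- 1 / vif        # "Tolerance" column
increased_se <- sqrt(vif)      # "Increased SE" column: factor by which the SE is inflated
r_squared    <- 1 - tolerance  # share of a predictor explained by the other predictors

round(cbind(vif, tolerance, increased_se, r_squared), 2)
```

All three VIF values are well below the commonly used thresholds of 5 or 10, which is consistent with the "Low Correlation" label in the output.
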

IV.B.5.a. Compare the models with the performance package

# Compare the models with the performance package
netting_simpson_beta_comp1 <- compare_performance(netting_simpson_mod1_beta,
                                                  netting_simpson_mod2_beta,
                                                  netting_simpson_mod3_beta,
                                                  netting_simpson_mod4_beta,
                                                  netting_simpson_mod5_beta,
                                                  netting_simpson_mod6_beta,
                                                  metrics = c("AICc", "BIC", "R2", "ICC", "RMSE"))
## Random effect variances not available. Returned R2 does not account for random effects.
## Random effect variances not available. Returned R2 does not account for random effects.
## Random effect variances not available. Returned R2 does not account for random effects.
## Random effect variances not available. Returned R2 does not account for random effects.
## Random effect variances not available. Returned R2 does not account for random effects.
# Print the comparison table
print(netting_simpson_beta_comp1)
## # Comparison of Model Performance Indices
## 
## Name                      |   Model | AICc (weights) |  BIC (weights)
## ---------------------------------------------------------------------
## netting_simpson_mod1_beta | glmmTMB | -142.3 (<.001) | -130.4 (<.001)
## netting_simpson_mod2_beta | glmmTMB | -145.7 (0.004) | -134.1 (0.001)
## netting_simpson_mod3_beta | glmmTMB | -149.0 (0.022) | -137.9 (0.008)
## netting_simpson_mod4_beta | glmmTMB | -152.1 (0.103) | -141.6 (0.054)
## netting_simpson_mod5_beta | glmmTMB | -154.7 (0.391) | -145.1 (0.311)
## netting_simpson_mod6_beta | glmmTMB | -155.1 (0.478) | -146.5 (0.625)
## 
## Name                      | R2 (cond.) | R2 (marg.) |  RMSE |   ICC
## -------------------------------------------------------------------
## netting_simpson_mod1_beta |            |      0.968 | 0.038 |      
## netting_simpson_mod2_beta |            |      0.968 | 0.039 |      
## netting_simpson_mod3_beta |            |      0.968 | 0.039 |      
## netting_simpson_mod4_beta |            |      0.968 | 0.039 |      
## netting_simpson_mod5_beta |            |      0.968 | 0.039 |      
## netting_simpson_mod6_beta |      0.963 |      0.948 | 0.039 | 0.289
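
The weights shown in parentheses next to the AICc values can be understood as Akaike weights. The base R sketch below recomputes them from the rounded AICc values in the table, assuming the standard formula (relative likelihood exp(-0.5 * delta), normalised to sum to 1); small differences from the printed weights come from rounding of the inputs.

```r
# Sketch: Akaike weights from the (rounded) AICc values in the comparison table.
aicc <- c(mod1 = -142.3, mod2 = -145.7, mod3 = -149.0,
          mod4 = -152.1, mod5 = -154.7, mod6 = -155.1)

delta   <- aicc - min(aicc)     # difference to the best (lowest AICc) model
rel_lik <- exp(-0.5 * delta)    # relative likelihood of each model
weights <- rel_lik / sum(rel_lik)

round(weights, 3)
```

Models 5 and 6 differ by only 0.4 AICc units, so the weights do not clearly separate them; model 6 is simply the more parsimonious of the two.
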

IV.B.5.b. visualize the model results

#plot_model(netting_simpson_mod1_beta, type = "est", show.values = TRUE, value.offset = 0.3)
#plot_model(netting_simpson_mod2_beta, type = "est", show.values = TRUE, value.offset = 0.3)
#plot_model(netting_simpson_mod3_beta, type = "est", show.values = TRUE, value.offset = 0.3)
#plot_model(netting_simpson_mod4_beta, type = "est", show.values = TRUE, value.offset = 0.3)

(netsimp_est <- plot_model(netting_simpson_mod6_beta, 
           type = "est", 
           show.values = TRUE, 
           value.offset = 0.3,
           #sort.est = TRUE,
           axis.labels = c( "Days Since Start",
                           "Wind Velocity (km/h)",
                           "Time of day")) +
    labs(title = "Transect walk: Pollinator Simpson Index", x = "Predictors",y = "Estimate") + 
    theme(axis.text.y = element_text(hjust = 0)))  # 0 = left, 1 = right

#plot_model(netting_simpson_mod5_beta, type = "est", show.values = TRUE, value.offset = 0.3)
#plot_model(netting_simpson_mod6_beta, type = "est", show.values = TRUE, value.offset = 0.3) 
## days since start netting_simpson_mod6_beta ---------
# Get the original mean and SD of days since start before scaling
days_mean <- mean(envir_data$Days_since_start, na.rm = TRUE)
days_sd <- sd(envir_data$Days_since_start, na.rm = TRUE)
# Get predictions on the scaled variable
pred_days <- ggpredict(netting_simpson_mod6_beta , terms = "Days_since_start")
# Unscale the x-axis
pred_days$x_unscaled <- (pred_days$x * days_sd) + days_mean
# Plot
ggplot(pred_days, aes(x = x_unscaled, y = predicted)) +
  geom_line(size = 1.2, color = predictor_colors[["Days_since_start"]]) +
  geom_ribbon(aes(ymin = conf.low, ymax = conf.high), 
              fill = alpha(predictor_colors[["Days_since_start"]], 0.5)) +
  labs(
    title = "Transect walk: Predicted Pollinator Simpson Diversity vs Days Since Start",
    x = "Days Since Start",
    y = "Predicted Pollinator Simpson Diversity Index"
  ) 

## minutes since 9am index netting_simpson_mod6_beta ---------
# Get the original mean and SD of minutes since 9am before scaling
minutes_mean <- mean(envir_data$minutes_since_9am, na.rm = TRUE)
minutes_sd <- sd(envir_data$minutes_since_9am, na.rm = TRUE)
# Get predictions on the scaled variable
pred_minutes <- ggpredict(netting_simpson_mod6_beta , terms = "minutes_since_9am")
# Unscale the x-axis
pred_minutes$x_unscaled <- (pred_minutes$x * minutes_sd) + minutes_mean
# divide by 60 to get hours, and add 9 to get hour of the day
pred_minutes$x_unscaled <- (pred_minutes$x_unscaled / 60) + 9
# Plot
ggplot(pred_minutes, aes(x = x_unscaled, y = predicted)) +
  geom_line(size = 1.2, color = predictor_colors[["minutes_since_9am"]]) +
  geom_ribbon(aes(ymin = conf.low, ymax = conf.high), 
              fill = alpha(predictor_colors[["minutes_since_9am"]], 0.5)) +
  labs(
    title = "Transect walk: Predicted Pollinator Simpson Diversity vs Time of day",
    x = "Time of day",
    y = "Predicted Pollinator Simpson Diversity Index"
  ) 
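
The back-transformation used for the x-axes above only recovers the original units if the mean and SD come from the same data that were used for scaling. The base R sketch below illustrates the round trip with hypothetical values (`days_raw` is made up for illustration, not taken from `envir_data`).

```r
# Sketch: scale a predictor and recover the original units.
days_raw <- c(0, 7, 14, 28, 42, 56)   # hypothetical sampling days

days_mean <- mean(days_raw)
days_sd   <- sd(days_raw)

days_scaled <- as.numeric(scale(days_raw))       # (x - mean) / sd
days_back   <- days_scaled * days_sd + days_mean # inverse transformation

all.equal(days_back, days_raw)
```

If the predictor was scaled on a different subset than the one used to compute `days_mean` and `days_sd`, the axis will be shifted or stretched, so it is worth confirming which data frame the scaling was applied to.
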

#combine netsimp_est and the two plots

IV.B.5.c. Interpretation of the model results

The fourth model (netting_simpson_mod4_beta) is the last one that meets the model assumptions, judged from the residuals vs. fitted plot produced by the DHARMa package. It is a beta regression with a logit link function, which is appropriate for continuous data bounded between 0 and 1, such as the Simpson index. The model includes time of day (p=), wind velocity (p=), site type (p=), sampling date (p=) and plot cover (p=) as fixed effects, and site as a random effect.
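
Because of the logit link, the coefficients are on the log-odds scale and need the inverse logit (`plogis()` in base R) to be read as Simpson index values. The sketch below uses the intercept and the `Days_since_start` coefficient copied from `summary(netting_simpson_mod6_beta)`, and assumes the numeric predictors were standardised (as the plotting code suggests), so that the intercept corresponds to average conditions.

```r
# Sketch: back-transform logit-scale coefficients to the response scale.
# Values copied from summary(netting_simpson_mod6_beta).
b_intercept <-  1.84549
b_days      <- -0.25933

# Predicted mean Simpson index with all scaled predictors at 0 (their mean)
mu_at_mean <- plogis(b_intercept)

# Predicted mean one SD later in the season, other predictors unchanged
mu_plus_1sd <- plogis(b_intercept + b_days)

round(c(at_mean = mu_at_mean, plus_1sd = mu_plus_1sd,
        change = mu_plus_1sd - mu_at_mean), 3)
```

Under these assumptions the predicted index drops from roughly 0.86 to roughly 0.83 for a one-SD increase in days since start. The change is not constant on the response scale, so it should be quoted together with the reference point.
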

#removal of unnecessary objects - all starting with netting_
rm(list = ls(pattern = "^netting_"))

IV.C. PLATFORM CAMERAS

At first, I also tried to include the minutes_since_9am variable for platform cameras, only to realize that it is not relevant in this case. The time at which each insect was captured is recorded in the filename of the image, so it can be examined directly, and the starting time of the camera is not an ecologically relevant variable.

IV.C.1. PLATFORM CAMERAS Counts - Poisson glmer (offset: recording time)

# count of insect per transect per site
platty <- platform_camera %>%
  group_by(location, transect) %>%
  summarise(count = n(), .groups = 'drop')  # Count the occurrence

#join the  specific envir_data already in platform_camera1
platform_counts <- platty %>%
  left_join(platform_camera1, by = c("location", "transect"))%>%
  #remove extra columns: start_time, top1, ID, Site_Tn, det_conf_mean, track_ID_imgs, top1_imgs, top1_prob_mean, top1_prob_weighted
 dplyr::select(-c(start_time, top1,ID, Site_Tn, det_conf_mean, track_ID_imgs, top1_imgs, top1_prob_mean, top1_prob_weighted)) %>%
  #keep only unique rows
  distinct()
  #scale numerical

#histogram of counts, binwidth = 1
platform_counts %>%
  ggplot(aes(x = count)) +
  geom_histogram(binwidth = 1, fill = "lightblue", color = "black") +
  labs(title = "Histogram of Insect Counts",
       x = "Insect Count",
       y = "Count")

rm(platty)

Since this is count data, it is going to be modeled with a Poisson distribution. However, if the data is overdispersed (variance > mean), we will use a negative binomial distribution instead.
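
The count models in this section use `offset = log(rec_time_min)`, which turns the model into one for detection rates (insects per minute of recording) rather than raw counts. The base R sketch below illustrates this with simulated data and `glm()` instead of `glmmTMB()`; the recording times and the true rate are hypothetical.

```r
# Sketch: a log offset makes a Poisson model estimate a rate per unit of effort.
set.seed(1)
rec_time_min <- c(30, 45, 60, 90, 120, 150)  # hypothetical recording durations
true_rate    <- 0.2                          # hypothetical insects per minute
count <- rpois(length(rec_time_min), lambda = true_rate * rec_time_min)

# Intercept-only Poisson model with log(recording time) as offset
fit <- glm(count ~ 1, offset = log(rec_time_min), family = poisson())

rate_hat <- unname(exp(coef(fit)))  # estimated insects per minute

# For this simple model the estimate equals total count / total recording time
c(model = rate_hat, pooled = sum(count) / sum(rec_time_min))
```

With an offset, `exp()` of a coefficient is read as a multiplicative effect on the rate per minute, so cameras that recorded for longer do not inflate the estimated effects.
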

# full model with insect counts as response variable and environmental, weather and plant diversity variables as explanatory variables, and site as random effect
# Poisson distribution
platform_count_mod1_poiss <- glmmTMB(count 
                                     ~ Floral_simpson_index_T
                                     #+ minutes_since_9am
                                     + top2_ratio
                                     + Site_type
                                     + dm_wind_velocity
                                     + dm_temperature
                                     + Days_since_start
                                     + Plot_Cover_T
                                     + (1 | location),
                                     offset = log(rec_time_min),
                                     family = poisson(),
                                     data = platform_counts)

summary(platform_count_mod1_poiss)
##  Family: poisson  ( log )
## Formula:          
## count ~ Floral_simpson_index_T + top2_ratio + Site_type + dm_wind_velocity +  
##     dm_temperature + Days_since_start + Plot_Cover_T + (1 | location)
## Data: platform_counts
##  Offset: log(rec_time_min)
## 
##      AIC      BIC   logLik deviance df.resid 
##    809.8    821.5   -395.9    791.8       18 
## 
## Random effects:
## 
## Conditional model:
##  Groups   Name        Variance Std.Dev.
##  location (Intercept) 0.7087   0.8418  
## Number of obs: 27, groups:  location, 9
## 
## Conditional model:
##                          Estimate Std. Error z value Pr(>|z|)    
## (Intercept)             -3.128054   0.408904  -7.650 2.01e-14 ***
## Floral_simpson_index_T   0.220944   0.044193   4.999 5.75e-07 ***
## top2_ratio               0.025457   0.049101   0.518  0.60414    
## Site_typeyoung_restored  1.276056   0.730777   1.746  0.08078 .  
## dm_wind_velocity        -1.444725   0.443008  -3.261  0.00111 ** 
## dm_temperature          -1.823937   0.437523  -4.169 3.06e-05 ***
## Days_since_start        -0.006418   0.215126  -0.030  0.97620    
## Plot_Cover_T             0.054349   0.006498   8.364  < 2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
parameters(platform_count_mod1_poiss)
## # Fixed Effects
## 
## Parameter                  |  Log-Mean |       SE |         95% CI |     z |      p
## -----------------------------------------------------------------------------------
## (Intercept)                |     -3.13 |     0.41 | [-3.93, -2.33] | -7.65 | < .001
## Floral simpson index T     |      0.22 |     0.04 | [ 0.13,  0.31] |  5.00 | < .001
## top2 ratio                 |      0.03 |     0.05 | [-0.07,  0.12] |  0.52 | 0.604 
## Site type [young_restored] |      1.28 |     0.73 | [-0.16,  2.71] |  1.75 | 0.081 
## dm wind velocity           |     -1.44 |     0.44 | [-2.31, -0.58] | -3.26 | 0.001 
## dm temperature             |     -1.82 |     0.44 | [-2.68, -0.97] | -4.17 | < .001
## Days since start           | -6.42e-03 |     0.22 | [-0.43,  0.42] | -0.03 | 0.976 
## Plot Cover T               |      0.05 | 6.50e-03 | [ 0.04,  0.07] |  8.36 | < .001
## 
## # Random Effects
## 
## Parameter                | Coefficient |       95% CI
## -----------------------------------------------------
## SD (Intercept: location) |        0.84 | [0.52, 1.37]
## 
## Uncertainty intervals (equal-tailed) and p-values (two-tailed) computed
##   using a Wald z-distribution approximation.
#check for singularity
performance::check_singularity(platform_count_mod1_poiss)
## [1] FALSE
#check the model
check_model(platform_count_mod1_poiss, verbose = T)
## `check_outliers()` does not yet support models of class `glmmTMB`.

#overdispersion
check_overdispersion(platform_count_mod1_poiss)
## # Overdispersion test
## 
##        dispersion ratio =  31.760
##   Pearson's Chi-Squared = 571.681
##                 p-value = < 0.001
## Overdispersion detected.
#collinearity
check_collinearity(platform_count_mod1_poiss)
## # Check for Multicollinearity
## 
## Low Correlation
## 
##                    Term  VIF   VIF 95% CI Increased SE Tolerance
##  Floral_simpson_index_T 1.97 [1.47, 3.00]         1.40      0.51
##              top2_ratio 1.07 [1.00, 4.47]         1.03      0.93
##               Site_type 1.57 [1.23, 2.40]         1.25      0.64
##        dm_wind_velocity 2.04 [1.51, 3.10]         1.43      0.49
##          dm_temperature 2.18 [1.60, 3.33]         1.48      0.46
##        Days_since_start 1.35 [1.11, 2.11]         1.16      0.74
##            Plot_Cover_T 2.02 [1.50, 3.07]         1.42      0.50
##  Tolerance 95% CI
##      [0.33, 0.68]
##      [0.22, 1.00]
##      [0.42, 0.81]
##      [0.32, 0.66]
##      [0.30, 0.63]
##      [0.47, 0.90]
##      [0.33, 0.67]
# DHARMa package - simulate residuals and check model assumptions
platform_count_mod1_poiss_sim_res <- simulateResiduals(fittedModel = platform_count_mod1_poiss)
plot(platform_count_mod1_poiss_sim_res)
## qu = 0.25, log(sigma) = -3.09791 : outer Newton did not converge fully.

There is overdispersion in the model (dispersion ratio = 31.76), so we will use a negative binomial distribution instead of a Poisson distribution.
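
The dispersion ratio reported for the Poisson model can be recovered from the other two numbers in the test output. The base R sketch below assumes the usual Pearson-based test (Pearson chi-squared divided by the residual degrees of freedom, compared against a chi-squared distribution); the residual df of 18 is taken from the `summary()` output.

```r
# Sketch: Pearson-based overdispersion check, using values from the output.
pearson_chisq <- 571.681  # Pearson's Chi-Squared from check_overdispersion()
df_resid      <- 18       # df.resid from summary(platform_count_mod1_poiss)

dispersion_ratio <- pearson_chisq / df_resid
p_value <- pchisq(pearson_chisq, df = df_resid, lower.tail = FALSE)

c(dispersion_ratio = dispersion_ratio, p_value = p_value)
```

A ratio near 1 is expected under a Poisson model; a value above 30 indicates that the variance is far larger than the mean, which is why the Poisson standard errors are not trusted here.
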

# full model with insect counts as response variable and environmental, weather and plant diversity variables as explanatory variables, and site as random effect
# Negative binomial distribution

platform_count_mod1_nb <- glmmTMB(count 
                                   ~ Floral_simpson_index_T 
                                   + top2_ratio
                                   + Site_type
                                   + dm_wind_velocity
                                   + dm_temperature
                                   + Days_since_start
                                   + Plot_Cover_T
                                   + (1 | location),
                                   offset = log(rec_time_min),
                                   family = nbinom2(),
                                   data = platform_counts)
summary(platform_count_mod1_nb)
##  Family: nbinom2  ( log )
## Formula:          
## count ~ Floral_simpson_index_T + top2_ratio + Site_type + dm_wind_velocity +  
##     dm_temperature + Days_since_start + Plot_Cover_T + (1 | location)
## Data: platform_counts
##  Offset: log(rec_time_min)
## 
##      AIC      BIC   logLik deviance df.resid 
##    252.8    265.8   -116.4    232.8       17 
## 
## Random effects:
## 
## Conditional model:
##  Groups   Name        Variance Std.Dev.
##  location (Intercept) 0.1382   0.3717  
## Number of obs: 27, groups:  location, 9
## 
## Dispersion parameter for nbinom2 family (): 1.24 
## 
## Conditional model:
##                         Estimate Std. Error z value Pr(>|z|)    
## (Intercept)             -2.06462    0.60918  -3.389 0.000701 ***
## Floral_simpson_index_T   0.32914    0.29730   1.107 0.268255    
## top2_ratio               0.03685    0.21646   0.170 0.864817    
## Site_typeyoung_restored  0.20342    0.75193   0.271 0.786750    
## dm_wind_velocity        -0.89562    0.38971  -2.298 0.021552 *  
## dm_temperature          -1.03101    0.44798  -2.301 0.021366 *  
## Days_since_start        -0.02048    0.18117  -0.113 0.909989    
## Plot_Cover_T             0.01262    0.03423   0.369 0.712331    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
parameters(platform_count_mod1_nb)
## # Fixed Effects
## 
## Parameter                  | Log-Mean |   SE |         95% CI |     z |      p
## ------------------------------------------------------------------------------
## (Intercept)                |    -2.06 | 0.61 | [-3.26, -0.87] | -3.39 | < .001
## Floral simpson index T     |     0.33 | 0.30 | [-0.25,  0.91] |  1.11 | 0.268 
## top2 ratio                 |     0.04 | 0.22 | [-0.39,  0.46] |  0.17 | 0.865 
## Site type [young_restored] |     0.20 | 0.75 | [-1.27,  1.68] |  0.27 | 0.787 
## dm wind velocity           |    -0.90 | 0.39 | [-1.66, -0.13] | -2.30 | 0.022 
## dm temperature             |    -1.03 | 0.45 | [-1.91, -0.15] | -2.30 | 0.021 
## Days since start           |    -0.02 | 0.18 | [-0.38,  0.33] | -0.11 | 0.910 
## Plot Cover T               |     0.01 | 0.03 | [-0.05,  0.08] |  0.37 | 0.712 
## 
## # Dispersion
## 
## Parameter   | Coefficient |       95% CI
## ----------------------------------------
## (Intercept) |        1.24 | [0.63, 2.41]
## 
## # Random Effects Variances
## 
## Parameter                | Coefficient |       95% CI
## -----------------------------------------------------
## SD (Intercept: location) |        0.37 | [0.05, 2.93]
## 
## Uncertainty intervals (equal-tailed) and p-values (two-tailed) computed
##   using a Wald z-distribution approximation.
#check for singularity
performance::check_singularity(platform_count_mod1_nb)
## [1] FALSE
#check the model
check_model(platform_count_mod1_nb, verbose = T)
## `check_outliers()` does not yet support models of class `glmmTMB`.

#overdispersion
check_overdispersion(platform_count_mod1_nb)
## # Overdispersion test
## 
##  dispersion ratio = 1.259
##           p-value = 0.472
## No overdispersion detected.
#collinearity
check_collinearity(platform_count_mod1_nb)
## # Check for Multicollinearity
## 
## Low Correlation
## 
##                    Term  VIF    VIF 95% CI Increased SE Tolerance
##  Floral_simpson_index_T 1.55 [1.23,  2.30]         1.25      0.65
##              top2_ratio 1.04 [1.00, 25.99]         1.02      0.96
##               Site_type 2.69 [1.94,  4.04]         1.64      0.37
##        dm_wind_velocity 2.62 [1.89,  3.92]         1.62      0.38
##          dm_temperature 3.95 [2.74,  5.99]         1.99      0.25
##        Days_since_start 1.21 [1.04,  1.97]         1.10      0.83
##            Plot_Cover_T 2.05 [1.54,  3.04]         1.43      0.49
##  Tolerance 95% CI
##      [0.43, 0.81]
##      [0.04, 1.00]
##      [0.25, 0.51]
##      [0.26, 0.53]
##      [0.17, 0.37]
##      [0.51, 0.96]
##      [0.33, 0.65]
# DHARMa package - simulate residuals and check model assumptions
platform_count_mod1_nb_sim_res <- simulateResiduals(fittedModel = platform_count_mod1_nb)
plot(platform_count_mod1_nb_sim_res)
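
Since the negative binomial model uses a log link, exponentiating a coefficient gives an incidence rate ratio (IRR), which is often easier to report than the log-mean. The base R sketch below does this for the two significant predictors, using the estimates and standard errors copied from `summary(platform_count_mod1_nb)`; interpreting them "per SD" assumes these predictors were standardised.

```r
# Sketch: incidence rate ratios with Wald 95% CIs from log-scale coefficients.
# Values copied from summary(platform_count_mod1_nb).
est <- c(dm_wind_velocity = -0.89562, dm_temperature = -1.03101)
se  <- c(dm_wind_velocity =  0.38971, dm_temperature =  0.44798)

irr     <- exp(est)                      # multiplicative change in detection rate
ci_low  <- exp(est - qnorm(0.975) * se)  # lower 95% limit
ci_high <- exp(est + qnorm(0.975) * se)  # upper 95% limit

round(cbind(irr, ci_low, ci_high), 2)
```

An IRR of about 0.41 for wind velocity means the expected detection rate falls to roughly 41% of its value for a one-unit increase in the predictor, with the other terms held constant.
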

# removing top2_ratio (p = 0.865 for platform_count_mod1_nb)

platform_count_mod2_nb <- glmmTMB(count 
                                   ~ Floral_simpson_index_T 
                                   #+ top2_ratio
                                   + Site_type
                                   + dm_wind_velocity
                                   + dm_temperature
                                   + Days_since_start
                                   + Plot_Cover_T
                                   + (1 | location),
                                   offset = log(rec_time_min),
                                   family = nbinom2(),
                                   data = platform_counts)
summary(platform_count_mod2_nb)
##  Family: nbinom2  ( log )
## Formula:          
## count ~ Floral_simpson_index_T + Site_type + dm_wind_velocity +  
##     dm_temperature + Days_since_start + Plot_Cover_T + (1 | location)
## Data: platform_counts
##  Offset: log(rec_time_min)
## 
##      AIC      BIC   logLik deviance df.resid 
##    250.8    262.5   -116.4    232.8       18 
## 
## Random effects:
## 
## Conditional model:
##  Groups   Name        Variance Std.Dev.
##  location (Intercept) 0.1259   0.3548  
## Number of obs: 27, groups:  location, 9
## 
## Dispersion parameter for nbinom2 family (): 1.22 
## 
## Conditional model:
##                         Estimate Std. Error z value Pr(>|z|)    
## (Intercept)             -2.04645    0.60443  -3.386  0.00071 ***
## Floral_simpson_index_T   0.32787    0.29833   1.099  0.27176    
## Site_typeyoung_restored  0.18732    0.74534   0.251  0.80156    
## dm_wind_velocity        -0.88530    0.38398  -2.306  0.02113 *  
## dm_temperature          -1.02351    0.44781  -2.286  0.02228 *  
## Days_since_start        -0.02060    0.18010  -0.114  0.90892    
## Plot_Cover_T             0.01238    0.03426   0.361  0.71777    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
parameters(platform_count_mod2_nb)
## # Fixed Effects
## 
## Parameter                  | Log-Mean |   SE |         95% CI |     z |      p
## ------------------------------------------------------------------------------
## (Intercept)                |    -2.05 | 0.60 | [-3.23, -0.86] | -3.39 | < .001
## Floral simpson index T     |     0.33 | 0.30 | [-0.26,  0.91] |  1.10 | 0.272 
## Site type [young_restored] |     0.19 | 0.75 | [-1.27,  1.65] |  0.25 | 0.802 
## dm wind velocity           |    -0.89 | 0.38 | [-1.64, -0.13] | -2.31 | 0.021 
## dm temperature             |    -1.02 | 0.45 | [-1.90, -0.15] | -2.29 | 0.022 
## Days since start           |    -0.02 | 0.18 | [-0.37,  0.33] | -0.11 | 0.909 
## Plot Cover T               |     0.01 | 0.03 | [-0.05,  0.08] |  0.36 | 0.718 
## 
## # Dispersion
## 
## Parameter   | Coefficient |       95% CI
## ----------------------------------------
## (Intercept) |        1.22 | [0.63, 2.40]
## 
## # Random Effects Variances
## 
## Parameter                | Coefficient |       95% CI
## -----------------------------------------------------
## SD (Intercept: location) |        0.35 | [0.04, 3.41]
## 
## Uncertainty intervals (equal-tailed) and p-values (two-tailed) computed
##   using a Wald z-distribution approximation.
#check for singularity
performance::check_singularity(platform_count_mod2_nb)
## [1] FALSE
#check the model
check_model(platform_count_mod2_nb, verbose = T)
## `check_outliers()` does not yet support models of class `glmmTMB`.

#overdispersion
check_overdispersion(platform_count_mod2_nb)
## # Overdispersion test
## 
##  dispersion ratio = 1.465
##           p-value = 0.392
## No overdispersion detected.
#collinearity
check_collinearity(platform_count_mod2_nb)
## # Check for Multicollinearity
## 
## Low Correlation
## 
##                    Term  VIF   VIF 95% CI Increased SE Tolerance
##  Floral_simpson_index_T 1.54 [1.22, 2.35]         1.24      0.65
##               Site_type 2.71 [1.93, 4.18]         1.65      0.37
##        dm_wind_velocity 2.60 [1.86, 4.00]         1.61      0.38
##          dm_temperature 4.04 [2.75, 6.31]         2.01      0.25
##        Days_since_start 1.20 [1.04, 2.07]         1.09      0.84
##            Plot_Cover_T 2.05 [1.52, 3.13]         1.43      0.49
##  Tolerance 95% CI
##      [0.42, 0.82]
##      [0.24, 0.52]
##      [0.25, 0.54]
##      [0.16, 0.36]
##      [0.48, 0.96]
##      [0.32, 0.66]
# DHARMa package - simulate residuals and check model assumptions
platform_count_mod2_nb_sim_res <- simulateResiduals(fittedModel = platform_count_mod2_nb)
plot(platform_count_mod2_nb_sim_res)

# removing days since start (p = 0.909 for platform_count_mod2_nb)
platform_count_mod3_nb <- glmmTMB(count 
                                   ~ Floral_simpson_index_T 
                                   #+ top2_ratio
                                   + Site_type
                                   + dm_wind_velocity
                                   + dm_temperature
                                   #+ Days_since_start
                                   + Plot_Cover_T
                                   + (1 | location),
                                   offset = log(rec_time_min),
                                   family = nbinom2(),
                                   data = platform_counts)
summary(platform_count_mod3_nb)
##  Family: nbinom2  ( log )
## Formula:          
## count ~ Floral_simpson_index_T + Site_type + dm_wind_velocity +  
##     dm_temperature + Plot_Cover_T + (1 | location)
## Data: platform_counts
##  Offset: log(rec_time_min)
## 
##      AIC      BIC   logLik deviance df.resid 
##    248.9    259.2   -116.4    232.9       19 
## 
## Random effects:
## 
## Conditional model:
##  Groups   Name        Variance Std.Dev.
##  location (Intercept) 0.1293   0.3596  
## Number of obs: 27, groups:  location, 9
## 
## Dispersion parameter for nbinom2 family (): 1.23 
## 
## Conditional model:
##                         Estimate Std. Error z value Pr(>|z|)    
## (Intercept)             -2.05143    0.60139  -3.411 0.000647 ***
## Floral_simpson_index_T   0.32914    0.29555   1.114 0.265417    
## Site_typeyoung_restored  0.18269    0.74464   0.245 0.806197    
## dm_wind_velocity        -0.89773    0.37103  -2.420 0.015538 *  
## dm_temperature          -1.02789    0.44675  -2.301 0.021404 *  
## Plot_Cover_T             0.01278    0.03395   0.377 0.706525    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
parameters(platform_count_mod3_nb)
## # Fixed Effects
## 
## Parameter                  | Log-Mean |   SE |         95% CI |     z |      p
## ------------------------------------------------------------------------------
## (Intercept)                |    -2.05 | 0.60 | [-3.23, -0.87] | -3.41 | < .001
## Floral simpson index T     |     0.33 | 0.30 | [-0.25,  0.91] |  1.11 | 0.265 
## Site type [young_restored] |     0.18 | 0.74 | [-1.28,  1.64] |  0.25 | 0.806 
## dm wind velocity           |    -0.90 | 0.37 | [-1.62, -0.17] | -2.42 | 0.016 
## dm temperature             |    -1.03 | 0.45 | [-1.90, -0.15] | -2.30 | 0.021 
## Plot Cover T               |     0.01 | 0.03 | [-0.05,  0.08] |  0.38 | 0.707 
## 
## # Dispersion
## 
## Parameter   | Coefficient |       95% CI
## ----------------------------------------
## (Intercept) |        1.23 | [0.63, 2.38]
## 
## # Random Effects Variances
## 
## Parameter                | Coefficient |       95% CI
## -----------------------------------------------------
## SD (Intercept: location) |        0.36 | [0.04, 3.06]
## 
## Uncertainty intervals (equal-tailed) and p-values (two-tailed) computed
##   using a Wald z-distribution approximation.
#check the model
check_model(platform_count_mod3_nb, verbose = T)
## `check_outliers()` does not yet support models of class `glmmTMB`.

#overdispersion
check_overdispersion(platform_count_mod3_nb)
## # Overdispersion test
## 
##  dispersion ratio = 1.373
##           p-value = 0.376
## No overdispersion detected.
#collinearity
check_collinearity(platform_count_mod3_nb)
## # Check for Multicollinearity
## 
## Low Correlation
## 
##                    Term  VIF   VIF 95% CI Increased SE Tolerance
##  Floral_simpson_index_T 1.54 [1.21, 2.43]         1.24      0.65
##               Site_type 2.69 [1.88, 4.25]         1.64      0.37
##        dm_wind_velocity 2.42 [1.72, 3.81]         1.56      0.41
##          dm_temperature 4.02 [2.68, 6.43]         2.00      0.25
##            Plot_Cover_T 2.02 [1.48, 3.17]         1.42      0.49
##  Tolerance 95% CI
##      [0.41, 0.83]
##      [0.24, 0.53]
##      [0.26, 0.58]
##      [0.16, 0.37]
##      [0.32, 0.67]
# dharma package - simulate residuals and check model assumptions
platform_count_mod3_nb_sim_res <- simulateResiduals(fittedModel = platform_count_mod3_nb)
plot(platform_count_mod3_nb_sim_res)

# removing site type (p = 0.806 for platform_count_mod3_nb)
platform_count_mod4_nb <- glmmTMB(count 
                                   ~ Floral_simpson_index_T 
                                   #+ top2_ratio
                                   #+ Site_type
                                   + dm_wind_velocity
                                   + dm_temperature
                                   #+ Days_since_start
                                   + Plot_Cover_T
                                   + (1 | location),
                                   offset = log(rec_time_min),
                                   family = nbinom2(),
                                   data = platform_counts)

summary(platform_count_mod4_nb)
##  Family: nbinom2  ( log )
## Formula:          
## count ~ Floral_simpson_index_T + dm_wind_velocity + dm_temperature +  
##     Plot_Cover_T + (1 | location)
## Data: platform_counts
##  Offset: log(rec_time_min)
## 
##      AIC      BIC   logLik deviance df.resid 
##    246.9    256.0   -116.5    232.9       20 
## 
## Random effects:
## 
## Conditional model:
##  Groups   Name        Variance Std.Dev.
##  location (Intercept) 0.1259   0.3548  
## Number of obs: 27, groups:  location, 9
## 
## Dispersion parameter for nbinom2 family (): 1.22 
## 
## Conditional model:
##                         Estimate Std. Error z value Pr(>|z|)    
## (Intercept)            -1.942525   0.410148  -4.736 2.18e-06 ***
## Floral_simpson_index_T  0.315459   0.290774   1.085  0.27797    
## dm_wind_velocity       -0.850387   0.316615  -2.686  0.00723 ** 
## dm_temperature         -0.946338   0.299489  -3.160  0.00158 ** 
## Plot_Cover_T            0.007939   0.027084   0.293  0.76943    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
parameters(platform_count_mod4_nb)
## # Fixed Effects
## 
## Parameter              | Log-Mean |   SE |         95% CI |     z |      p
## --------------------------------------------------------------------------
## (Intercept)            |    -1.94 | 0.41 | [-2.75, -1.14] | -4.74 | < .001
## Floral simpson index T |     0.32 | 0.29 | [-0.25,  0.89] |  1.08 | 0.278 
## dm wind velocity       |    -0.85 | 0.32 | [-1.47, -0.23] | -2.69 | 0.007 
## dm temperature         |    -0.95 | 0.30 | [-1.53, -0.36] | -3.16 | 0.002 
## Plot Cover T           | 7.94e-03 | 0.03 | [-0.05,  0.06] |  0.29 | 0.769 
## 
## # Dispersion
## 
## Parameter   | Coefficient |       95% CI
## ----------------------------------------
## (Intercept) |        1.22 | [0.63, 2.37]
## 
## # Random Effects Variances
## 
## Parameter                | Coefficient |       95% CI
## -----------------------------------------------------
## SD (Intercept: location) |        0.35 | [0.04, 3.28]
## 
## Uncertainty intervals (equal-tailed) and p-values (two-tailed) computed
##   using a Wald z-distribution approximation.
#check the model
check_model(platform_count_mod4_nb, verbose = T)
## `check_outliers()` does not yet support models of class `glmmTMB`.

#overdispersion
check_overdispersion(platform_count_mod4_nb)
## # Overdispersion test
## 
##  dispersion ratio = 1.339
##           p-value = 0.352
## No overdispersion detected.
#collinearity
check_collinearity(platform_count_mod4_nb)
## # Check for Multicollinearity
## 
## Low Correlation
## 
##                    Term  VIF   VIF 95% CI Increased SE Tolerance
##  Floral_simpson_index_T 1.47 [1.16, 2.41]         1.21      0.68
##        dm_wind_velocity 1.78 [1.33, 2.86]         1.33      0.56
##          dm_temperature 1.81 [1.34, 2.91]         1.34      0.55
##            Plot_Cover_T 1.32 [1.08, 2.25]         1.15      0.76
##  Tolerance 95% CI
##      [0.42, 0.86]
##      [0.35, 0.75]
##      [0.34, 0.74]
##      [0.44, 0.92]
# dharma package - simulate residuals and check model assumptions
platform_count_mod4_nb_sim_res <- simulateResiduals(fittedModel = platform_count_mod4_nb)
plot(platform_count_mod4_nb_sim_res)

# removing Plot_Cover_T (p = 0.769 for platform_count_mod4_nb)
platform_count_mod5_nb <- glmmTMB(count 
                                   ~ Floral_simpson_index_T 
                                   #+ top2_ratio
                                   #+ Site_type
                                   + dm_wind_velocity
                                   + dm_temperature
                                   #+ Days_since_start
                                   #+ Plot_Cover_T
                                   + (1 | location),
                                   offset = log(rec_time_min),
                                   family = nbinom2(),
                                   data = platform_counts)

summary(platform_count_mod5_nb)
##  Family: nbinom2  ( log )
## Formula:          
## count ~ Floral_simpson_index_T + dm_wind_velocity + dm_temperature +  
##     (1 | location)
## Data: platform_counts
##  Offset: log(rec_time_min)
## 
##      AIC      BIC   logLik deviance df.resid 
##    245.0    252.8   -116.5    233.0       21 
## 
## Random effects:
## 
## Conditional model:
##  Groups   Name        Variance Std.Dev.
##  location (Intercept) 0.1429   0.378   
## Number of obs: 27, groups:  location, 9
## 
## Dispersion parameter for nbinom2 family (): 1.23 
## 
## Conditional model:
##                        Estimate Std. Error z value Pr(>|z|)    
## (Intercept)             -1.8560     0.2937  -6.320 2.61e-10 ***
## Floral_simpson_index_T   0.2773     0.2589   1.071   0.2841    
## dm_wind_velocity        -0.8556     0.3189  -2.683   0.0073 ** 
## dm_temperature          -0.9383     0.2990  -3.138   0.0017 ** 
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
parameters(platform_count_mod5_nb)
## # Fixed Effects
## 
## Parameter              | Log-Mean |   SE |         95% CI |     z |      p
## --------------------------------------------------------------------------
## (Intercept)            |    -1.86 | 0.29 | [-2.43, -1.28] | -6.32 | < .001
## Floral simpson index T |     0.28 | 0.26 | [-0.23,  0.78] |  1.07 | 0.284 
## dm wind velocity       |    -0.86 | 0.32 | [-1.48, -0.23] | -2.68 | 0.007 
## dm temperature         |    -0.94 | 0.30 | [-1.52, -0.35] | -3.14 | 0.002 
## 
## # Dispersion
## 
## Parameter   | Coefficient |       95% CI
## ----------------------------------------
## (Intercept) |        1.23 | [0.65, 2.34]
## 
## # Random Effects Variances
## 
## Parameter                | Coefficient |       95% CI
## -----------------------------------------------------
## SD (Intercept: location) |        0.38 | [0.06, 2.38]
## 
## Uncertainty intervals (equal-tailed) and p-values (two-tailed) computed
##   using a Wald z-distribution approximation.
#check the model
check_model(platform_count_mod5_nb, verbose = T)
## `check_outliers()` does not yet support models of class `glmmTMB`.

#overdispersion
check_overdispersion(platform_count_mod5_nb)
## # Overdispersion test
## 
##  dispersion ratio = 0.996
##           p-value = 0.384
## No overdispersion detected.
#collinearity
check_collinearity(platform_count_mod5_nb)
## # Check for Multicollinearity
## 
## Low Correlation
## 
##                    Term  VIF   VIF 95% CI Increased SE Tolerance
##  Floral_simpson_index_T 1.17 [1.02, 2.55]         1.08      0.86
##        dm_wind_velocity 1.74 [1.29, 2.88]         1.32      0.58
##          dm_temperature 1.76 [1.30, 2.91]         1.32      0.57
##  Tolerance 95% CI
##      [0.39, 0.98]
##      [0.35, 0.78]
##      [0.34, 0.77]
# dharma package - simulate residuals and check model assumptions
platform_count_mod5_nb_sim_res <- simulateResiduals(fittedModel = platform_count_mod5_nb)
plot(platform_count_mod5_nb_sim_res)

# removing floral Simpson index (p = 0.284 for platform_count_mod5_nb)
platform_count_mod6_nb <- glmmTMB(count 
                                   #~ Floral_simpson_index_T 
                                   #+ top2_ratio
                                   #+ Site_type
                                   ~ dm_wind_velocity
                                   + dm_temperature
                                   #+ Days_since_start
                                   #+ Plot_Cover_T
                                   + (1 | location),
                                   offset = log(rec_time_min),
                                   family = nbinom2(),
                                   data = platform_counts)
summary(platform_count_mod6_nb)
##  Family: nbinom2  ( log )
## Formula:          count ~ dm_wind_velocity + dm_temperature + (1 | location)
## Data: platform_counts
##  Offset: log(rec_time_min)
## 
##      AIC      BIC   logLik deviance df.resid 
##    244.2    250.7   -117.1    234.2       22 
## 
## Random effects:
## 
## Conditional model:
##  Groups   Name        Variance Std.Dev.
##  location (Intercept) 0.2672   0.5169  
## Number of obs: 27, groups:  location, 9
## 
## Dispersion parameter for nbinom2 family (): 1.28 
## 
## Conditional model:
##                  Estimate Std. Error z value Pr(>|z|)    
## (Intercept)       -1.9575     0.2968  -6.595 4.24e-11 ***
## dm_wind_velocity  -0.9755     0.3343  -2.918 0.003518 ** 
## dm_temperature    -1.0505     0.3137  -3.348 0.000813 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
parameters(platform_count_mod6_nb)
## # Fixed Effects
## 
## Parameter        | Log-Mean |   SE |         95% CI |     z |      p
## --------------------------------------------------------------------
## (Intercept)      |    -1.96 | 0.30 | [-2.54, -1.38] | -6.60 | < .001
## dm wind velocity |    -0.98 | 0.33 | [-1.63, -0.32] | -2.92 | 0.004 
## dm temperature   |    -1.05 | 0.31 | [-1.67, -0.44] | -3.35 | < .001
## 
## # Dispersion
## 
## Parameter   | Coefficient |       95% CI
## ----------------------------------------
## (Intercept) |        1.28 | [0.69, 2.39]
## 
## # Random Effects Variances
## 
## Parameter                | Coefficient |       95% CI
## -----------------------------------------------------
## SD (Intercept: location) |        0.52 | [0.19, 1.41]
## 
## Uncertainty intervals (equal-tailed) and p-values (two-tailed) computed
##   using a Wald z-distribution approximation.
#check the model
check_model(platform_count_mod6_nb, verbose = T)
## `check_outliers()` does not yet support models of class `glmmTMB`.

#overdispersion
check_overdispersion(platform_count_mod6_nb)
## # Overdispersion test
## 
##  dispersion ratio = 1.177
##           p-value =  0.52
## No overdispersion detected.
#collinearity
check_collinearity(platform_count_mod6_nb)
## # Check for Multicollinearity
## 
## Low Correlation
## 
##              Term  VIF   VIF 95% CI Increased SE Tolerance Tolerance 95% CI
##  dm_wind_velocity 1.50 [1.16, 2.61]         1.23      0.67     [0.38, 0.86]
##    dm_temperature 1.50 [1.16, 2.61]         1.23      0.67     [0.38, 0.86]
# dharma package - simulate residuals and check model assumptions
platform_count_mod6_nb_sim_res <- simulateResiduals(fittedModel = platform_count_mod6_nb)
plot(platform_count_mod6_nb_sim_res)

IV.C.1.a. Compare the models with the performance package
# Compare the models with the performance package
platform_count_nb_comp1 <- compare_performance(platform_count_mod1_nb, platform_count_mod2_nb, platform_count_mod3_nb, platform_count_mod4_nb,platform_count_mod5_nb,platform_count_mod6_nb,
                                               metrics = c("AICc", "BIC", "R2", "ICC", "RMSE"))
# Print the comparison table
print(platform_count_nb_comp1)
## # Comparison of Model Performance Indices
## 
## Name                   |   Model | AICc (weights) | BIC (weights) | R2 (cond.)
## ------------------------------------------------------------------------------
## platform_count_mod1_nb | glmmTMB |  266.6 (<.001) | 265.8 (<.001) |      0.351
## platform_count_mod2_nb | glmmTMB |  261.4 (<.001) | 262.5 (0.002) |      0.348
## platform_count_mod3_nb | glmmTMB |  256.9 (0.005) | 259.2 (0.010) |      0.346
## platform_count_mod4_nb | glmmTMB |  252.8 (0.040) | 256.0 (0.049) |      0.344
## platform_count_mod5_nb | glmmTMB |  249.2 (0.246) | 252.8 (0.246) |      0.351
## platform_count_mod6_nb | glmmTMB |  247.1 (0.708) | 250.7 (0.693) |      0.359
## 
## Name                   | R2 (marg.) |   ICC |   RMSE
## ----------------------------------------------------
## platform_count_mod1_nb |      0.311 | 0.059 | 58.238
## platform_count_mod2_nb |      0.310 | 0.054 | 58.499
## platform_count_mod3_nb |      0.307 | 0.055 | 58.359
## platform_count_mod4_nb |      0.306 | 0.054 | 59.030
## platform_count_mod5_nb |      0.309 | 0.061 | 59.812
## platform_count_mod6_nb |      0.282 | 0.108 | 56.465

The sixth model (platform_count_mod6_nb) is the best one because it has the lowest AICc (247.1) and BIC (250.7) values while still meeting the model's assumptions (the DHARMa plot of fitted vs. residual values looks acceptable, no overdispersion detected, low collinearity).
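As a cross-check on the comparison table, the AICc values can be reproduced by hand from the AIC values printed by summary(). This is only a sketch: the vectors `aic` and `k` are typed in from the rounded output above rather than extracted from the fitted objects, `k` is assumed to count the fixed effects plus one random-intercept SD plus one dispersion parameter, and the weights are computed among models 3 to 6 only, so they will differ slightly from the six-model table.

```r
# Values typed in from the summary() output above (rounded to 0.1)
aic <- c(mod3 = 248.9, mod4 = 246.9, mod5 = 245.0, mod6 = 244.2)
# Assumed parameter count: fixed effects + random-intercept SD + dispersion
k   <- c(mod3 = 8, mod4 = 7, mod5 = 6, mod6 = 5)
n   <- 27  # number of observations

# Small-sample correction: AICc = AIC + 2k(k + 1) / (n - k - 1)
aicc <- aic + (2 * k * (k + 1)) / (n - k - 1)

# Akaike weights among these four models only
delta   <- aicc - min(aicc)
weights <- exp(-delta / 2) / sum(exp(-delta / 2))

round(cbind(AICc = aicc, delta = delta, weight = weights), 3)
```

With only 27 observations the correction term is large for the richer models (8 AICc units for the eight-parameter model), which is why AICc separates the candidates more strongly than AIC does.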

IV.C.1.b. visualize the model results

#plot_model(platform_count_mod1_nb , type = "est", show.values = TRUE, value.offset = .3)
#plot_model(platform_count_mod2_nb , type = "est", show.values = TRUE, value.offset = .3)
#plot_model(platform_count_mod3_nb , type = "est", show.values = TRUE, value.offset = .3)
plot_model(platform_count_mod4_nb , type = "est", show.values = TRUE, value.offset = .3)

plot_model(platform_count_mod6_nb, 
           type = "est", 
           show.values = TRUE, 
           value.offset = 0.3,
           #sort.est = TRUE,
           axis.labels = c(#"Flower Cover % per Transect",
                           "Temperature",
                           "Wind Velocity (km/h)"#,
                           #"Floral Simpson Index"
                           )) +
    labs(title = "Platform Camera: Count of Pollinators", x = "Predictors",y = "Estimate") + 
    theme(axis.text.y = element_text(hjust = 0))  # 0 = left, 1 = right

#plot_model(platform_count_mod5_nb , type = "est", show.values = TRUE, value.offset = .3)
#plot_model(platform_count_mod6_nb , type = "est", show.values = TRUE, value.offset = .3) 
## temperature platform_count_mod3_nb ---------
# Get the original mean and SD of temperature before scaling
temp_mean <- mean(envir_data$dm_temperature, na.rm = TRUE)
temp_sd <- sd(envir_data$dm_temperature, na.rm = TRUE)

# Get predictions on the scaled variable
pred_temp <- ggpredict(platform_count_mod3_nb , terms = "dm_temperature")

# Unscale the x-axis
pred_temp$x_unscaled <- (pred_temp$x * temp_sd) + temp_mean

# Plot
ggplot(pred_temp, aes(x = x_unscaled, y = predicted)) +
  #plot with predictor color
  geom_line(size = 1.2, color = predictor_colors[["dm_temperature"]]) +
  geom_ribbon(aes(ymin = conf.low, ymax = conf.high), 
              fill = alpha(predictor_colors[["dm_temperature"]], 0.5)) +
  labs(
    title = "Platform cameras: Predicted Insect Count vs Temperature",
    x = "Temperature",
    y = "Predicted Insect Count")

## wind velocity platform_count_mod3_nb ---------
# Get the original mean and SD of wind velocity before scaling
wind_mean <- mean(envir_data$dm_wind_velocity, na.rm = TRUE)
wind_sd <- sd(envir_data$dm_wind_velocity, na.rm = TRUE)

# Get predictions on the scaled variable
pred_wind <- ggpredict(platform_count_mod3_nb , terms = "dm_wind_velocity")

# Unscale the x-axis
pred_wind$x_unscaled <- (pred_wind$x * wind_sd) + wind_mean

# Plot
(wind_count_platform <-ggplot(pred_wind, aes(x = x_unscaled, y = predicted)) +
  #plot with predictor color
  geom_line(size = 1.2, color = predictor_colors[["dm_wind_velocity"]]) +
  geom_ribbon(aes(ymin = conf.low, ymax = conf.high), 
              fill = alpha(predictor_colors[["dm_wind_velocity"]], 0.5)) +
  labs(
    title = "Platform cameras: Predicted Insect Count vs Wind velocity",
    x = "Wind velocity (km/h)",
    y = "Predicted Insect Count"
  ))
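The back-transformation used for both plots recovers the original units only if the mean and SD come from the same values that were standardised. A minimal sketch of the round trip, using a made-up vector (`wind_raw` is hypothetical, not the study data) and the attributes that `scale()` stores:

```r
# Hypothetical raw wind velocities (km/h); not the study data
wind_raw <- c(3.2, 5.1, 7.8, 2.4, 6.0)

# scale() centres and divides by the SD, keeping both as attributes
wind_z   <- scale(wind_raw)
wind_mu  <- attr(wind_z, "scaled:center")
wind_sig <- attr(wind_z, "scaled:scale")

# Back-transform: x = z * sd + mean
wind_back <- as.numeric(wind_z) * wind_sig + wind_mu

all.equal(wind_back, wind_raw)
```

If the predictors in platform_counts were scaled on a different subset than envir_data, the axis values in the plots would be shifted, so it may be worth confirming that both use the same rows.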

IV.C.1.c. Interpretation of the model results
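
One way to read the coefficients of platform_count_mod6_nb is to exponentiate the log-link estimates into rate ratios. The sketch below types the estimates and standard errors in from the summary output rather than extracting them from the model object. It assumes that dm_wind_velocity and dm_temperature were standardised (as the unscaling step in the plotting code suggests), so each ratio would be the multiplicative change in expected count per recording minute for a one-SD increase in the predictor.

```r
# Log-scale estimates and SEs typed in from summary(platform_count_mod6_nb)
est <- c(dm_wind_velocity = -0.9755, dm_temperature = -1.0505)
se  <- c(dm_wind_velocity =  0.3343, dm_temperature =  0.3137)

# Exponentiate to get rate ratios with Wald 95% intervals
z  <- qnorm(0.975)
rr <- cbind(rate_ratio = exp(est),
            ci_low     = exp(est - z * se),
            ci_high    = exp(est + z * se))
round(rr, 2)
```

Both rate ratios and their upper interval bounds fall below 1, consistent with fewer detections per minute at higher wind velocity and higher temperature.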

#remove all objects starting with platform_count_
rm(list = ls(pattern = "^platform_count_"))

IV.C.2. PLATFORM CAMERAS Richness - Poisson GLMM (glmmTMB)

# richness (number of distinct top1 labels) per transect per site
platty <- platform_camera %>%
  group_by(location, transect) %>%
  summarise(richness = n_distinct(top1), .groups = 'drop')  # Count the distinct top1 labels

#join the scaled_envir_data
platform_richness <- platty %>%
  left_join(platform_camera1, by = c("location", "transect"))%>%
  #REMOVE EXTRA COLUMNS "ID" "Site_Tn" "det_conf_mean" "track_ID_imgs" "top1_imgs" "top1_prob_mean" "top1_prob_weighted"
 dplyr::select(-c(top1,ID, Site_Tn, det_conf_mean, track_ID_imgs, top1_imgs, top1_prob_mean, top1_prob_weighted,start_time)) %>%
  #keep only unique rows
  distinct()

#histogram of richness
platform_richness %>%
  ggplot(aes(x = richness)) +
  geom_histogram(binwidth = 1, fill = "lightblue", color = "black") +
  labs(title = "Histogram of Insect Richness",
       x = "Insect Richness",
       y = "Count")

#testing the normality of richness
shapiro.test(platform_richness$richness) # p-value = 0.4406, no significant departure from normality
## 
##  Shapiro-Wilk normality test
## 
## data:  platform_richness$richness
## W = 0.96342, p-value = 0.4406
#testing skewness
datawizard::describe_distribution(platform_richness$richness)
## Mean |   SD | IQR |         Range | Skewness | Kurtosis |  n | n_Missing
## ------------------------------------------------------------------------
## 5.19 | 2.72 |   4 | [1.00, 11.00] |     0.12 |    -0.67 | 27 |         0

Richness recorded by the platform cameras does not depart significantly from a normal distribution (Shapiro-Wilk test: W = 0.963, p = 0.4406). Skewness is 0.12, indicating a very slight right skew, and kurtosis is -0.67, indicating a platykurtic distribution (flatter than normal), though still close to normal.

# full model with insect richness as response variable and environmental, weather and plant diversity variables as explanatory variables, and site as random effect
# Poisson distribution

platform_richness_mod1_poiss <- glmmTMB(richness 
                                        ~ Floral_simpson_index_T 
                                        + rec_time_min
                                        + top2_ratio
                                        + Site_type
                                        + dm_wind_velocity
                                        + dm_temperature
                                        + Days_since_start
                                        + Plot_Cover_T
                                        + (1 | location),
                                        family = poisson(),
                                        data = platform_richness)
## Warning in (function (start, objective, gradient = NULL, hessian = NULL, :
## NA/NaN function evaluation
summary(platform_richness_mod1_poiss)
##  Family: poisson  ( log )
## Formula:          
## richness ~ Floral_simpson_index_T + rec_time_min + top2_ratio +  
##     Site_type + dm_wind_velocity + dm_temperature + Days_since_start +  
##     Plot_Cover_T + (1 | location)
## Data: platform_richness
## 
##      AIC      BIC   logLik deviance df.resid 
##    122.0    134.9    -51.0    102.0       17 
## 
## Random effects:
## 
## Conditional model:
##  Groups   Name        Variance  Std.Dev. 
##  location (Intercept) 1.092e-10 1.045e-05
## Number of obs: 27, groups:  location, 9
## 
## Conditional model:
##                          Estimate Std. Error z value Pr(>|z|)    
## (Intercept)              1.189973   0.657481   1.810 0.070312 .  
## Floral_simpson_index_T   0.188121   0.122310   1.538 0.124031    
## rec_time_min             0.000641   0.001793   0.357 0.720790    
## top2_ratio               0.052077   0.079484   0.655 0.512345    
## Site_typeyoung_restored  0.554649   0.298623   1.857 0.063260 .  
## dm_wind_velocity        -0.455449   0.164837  -2.763 0.005727 ** 
## dm_temperature          -0.677503   0.191176  -3.544 0.000394 ***
## Days_since_start        -0.077615   0.069092  -1.123 0.261289    
## Plot_Cover_T             0.022637   0.012887   1.757 0.079002 .  
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
parameters(platform_richness_mod1_poiss)
## Your model may suffer from singularity (see `?lme4::isSingular` and
##   `?performance::check_singularity`).
##   Some of the confidence intervals of the random effects parameters are
##   probably not meaningful!
##   You may try to impose a prior on the random effects parameters, e.g.
##   using the glmmTMB package.
## # Fixed Effects
## 
## Parameter                  | Log-Mean |       SE |         95% CI |     z |      p
## ----------------------------------------------------------------------------------
## (Intercept)                |     1.19 |     0.66 | [-0.10,  2.48] |  1.81 | 0.070 
## Floral simpson index T     |     0.19 |     0.12 | [-0.05,  0.43] |  1.54 | 0.124 
## rec time min               | 6.41e-04 | 1.79e-03 | [ 0.00,  0.00] |  0.36 | 0.721 
## top2 ratio                 |     0.05 |     0.08 | [-0.10,  0.21] |  0.66 | 0.512 
## Site type [young_restored] |     0.55 |     0.30 | [-0.03,  1.14] |  1.86 | 0.063 
## dm wind velocity           |    -0.46 |     0.16 | [-0.78, -0.13] | -2.76 | 0.006 
## dm temperature             |    -0.68 |     0.19 | [-1.05, -0.30] | -3.54 | < .001
## Days since start           |    -0.08 |     0.07 | [-0.21,  0.06] | -1.12 | 0.261 
## Plot Cover T               |     0.02 |     0.01 | [ 0.00,  0.05] |  1.76 | 0.079 
## 
## # Random Effects
## 
## Parameter                | Coefficient |      95% CI
## ----------------------------------------------------
## SD (Intercept: location) |    1.05e-05 | [0.00, Inf]
## 
## Uncertainty intervals (equal-tailed) and p-values (two-tailed) computed
##   using a Wald z-distribution approximation.
#check for singularity
performance::check_singularity(platform_richness_mod1_poiss)
## [1] TRUE
#check the model
check_model(platform_richness_mod1_poiss, verbose = T)
## Homogeneity of variance could not be computed. Cannot extract residual
##   variance from objects of class 'glmmTMB'.
## `check_outliers()` does not yet support models of class `glmmTMB`.

#overdispersion
check_overdispersion(platform_richness_mod1_poiss)
## # Overdispersion test
## 
##        dispersion ratio =  0.666
##   Pearson's Chi-Squared = 11.319
##                 p-value =  0.839
## No overdispersion detected.
#collinearity
check_collinearity(platform_richness_mod1_poiss)
## # Check for Multicollinearity
## 
## Low Correlation
## 
##                    Term  VIF   VIF 95% CI Increased SE Tolerance
##  Floral_simpson_index_T 2.14 [1.60, 3.18]         1.46      0.47
##            rec_time_min 1.38 [1.14, 2.08]         1.18      0.72
##              top2_ratio 1.13 [1.01, 2.18]         1.06      0.89
##               Site_type 2.87 [2.05, 4.31]         1.69      0.35
##        dm_wind_velocity 3.76 [2.62, 5.70]         1.94      0.27
##        Days_since_start 1.29 [1.08, 1.99]         1.13      0.78
##            Plot_Cover_T 2.98 [2.13, 4.49]         1.73      0.34
##  Tolerance 95% CI
##      [0.31, 0.63]
##      [0.48, 0.88]
##      [0.46, 0.99]
##      [0.23, 0.49]
##      [0.18, 0.38]
##      [0.50, 0.92]
##      [0.22, 0.47]
## 
## Moderate Correlation
## 
##            Term  VIF   VIF 95% CI Increased SE Tolerance Tolerance 95% CI
##  dm_temperature 5.54 [3.75, 8.49]         2.35      0.18     [0.12, 0.27]
# dharma package - simulate residuals and check model assumptions
platform_richness_mod1_poiss_sim_res <- simulateResiduals(fittedModel = platform_richness_mod1_poiss)
plot(platform_richness_mod1_poiss_sim_res)

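The Poisson model seems to fit these data reasonably well: there is no sign of overdispersion (dispersion ratio = 0.666, p = 0.839), and collinearity is low for all predictors except dm_temperature, which shows moderate collinearity (VIF = 5.54). Note that the fit is singular: the random-intercept variance for location is estimated at essentially zero (1.092e-10).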
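
The dispersion ratio reported by check_overdispersion() can be reproduced by hand as the Pearson chi-squared statistic divided by the residual degrees of freedom. The sketch below types both numbers in from the output above instead of recomputing them from the fitted model, and assumes the reported p-value is the upper-tail chi-squared probability.

```r
# Values typed in from check_overdispersion() and summary() above
pearson_chisq <- 11.319
df_resid      <- 17

# A ratio near 1 is what a Poisson model expects; > 1 suggests overdispersion
dispersion_ratio <- pearson_chisq / df_resid

# Upper-tail chi-squared test for overdispersion
p_value <- pchisq(pearson_chisq, df = df_resid, lower.tail = FALSE)

round(c(ratio = dispersion_ratio, p = p_value), 3)
```

A ratio well below 1, as here, points toward underdispersion rather than overdispersion, which this one-sided test does not flag.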

#remove rec time min (p = 0.721 for platform_richness_mod1_poiss)

platform_richness_mod2_poiss <- glmmTMB(richness 
                                        ~ Floral_simpson_index_T 
                                        + top2_ratio
                                        #+ rec_time_min
                                        + Site_type
                                        + dm_wind_velocity
                                        + dm_temperature
                                        + Days_since_start
                                        + Plot_Cover_T
                                        + (1 | location),
                                        family = poisson(),
                                        data = platform_richness)
summary(platform_richness_mod2_poiss)
##  Family: poisson  ( log )
## Formula:          richness ~ Floral_simpson_index_T + top2_ratio + Site_type +  
##     dm_wind_velocity + dm_temperature + Days_since_start + Plot_Cover_T +  
##     (1 | location)
## Data: platform_richness
## 
##      AIC      BIC   logLik deviance df.resid 
##    120.1    131.8    -51.0    102.1       18 
## 
## Random effects:
## 
## Conditional model:
##  Groups   Name        Variance  Std.Dev. 
##  location (Intercept) 8.469e-11 9.203e-06
## Number of obs: 27, groups:  location, 9
## 
## Conditional model:
##                         Estimate Std. Error z value Pr(>|z|)    
## (Intercept)              1.40658    0.22806   6.168 6.94e-10 ***
## Floral_simpson_index_T   0.19082    0.12213   1.562 0.118174    
## top2_ratio               0.04920    0.07907   0.622 0.533790    
## Site_typeyoung_restored  0.54196    0.29362   1.846 0.064921 .  
## dm_wind_velocity        -0.47045    0.15969  -2.946 0.003218 ** 
## dm_temperature          -0.69654    0.18339  -3.798 0.000146 ***
## Days_since_start        -0.07364    0.06794  -1.084 0.278389    
## Plot_Cover_T             0.02336    0.01272   1.836 0.066292 .  
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
parameters(platform_richness_mod2_poiss)
## Your model may suffer from singularity (see `?lme4::isSingular` and
##   `?performance::check_singularity`).
##   Some of the confidence intervals of the random effects parameters are
##   probably not meaningful!
##   You may try to impose a prior on the random effects parameters, e.g.
##   using the glmmTMB package.
## # Fixed Effects
## 
## Parameter                  | Log-Mean |   SE |         95% CI |     z |      p
## ------------------------------------------------------------------------------
## (Intercept)                |     1.41 | 0.23 | [ 0.96,  1.85] |  6.17 | < .001
## Floral simpson index T     |     0.19 | 0.12 | [-0.05,  0.43] |  1.56 | 0.118 
## top2 ratio                 |     0.05 | 0.08 | [-0.11,  0.20] |  0.62 | 0.534 
## Site type [young_restored] |     0.54 | 0.29 | [-0.03,  1.12] |  1.85 | 0.065 
## dm wind velocity           |    -0.47 | 0.16 | [-0.78, -0.16] | -2.95 | 0.003 
## dm temperature             |    -0.70 | 0.18 | [-1.06, -0.34] | -3.80 | < .001
## Days since start           |    -0.07 | 0.07 | [-0.21,  0.06] | -1.08 | 0.278 
## Plot Cover T               |     0.02 | 0.01 | [ 0.00,  0.05] |  1.84 | 0.066 
## 
## # Random Effects
## 
## Parameter                | Coefficient |      95% CI
## ----------------------------------------------------
## SD (Intercept: location) |    9.20e-06 | [0.00, Inf]
## 
## Uncertainty intervals (equal-tailed) and p-values (two-tailed) computed
##   using a Wald z-distribution approximation.
#check for singularity
performance::check_singularity(platform_richness_mod2_poiss)
## [1] TRUE
#check the model
check_model(platform_richness_mod2_poiss, verbose = T)
## Homogeneity of variance could not be computed. Cannot extract residual
##   variance from objects of class 'glmmTMB'.
## `check_outliers()` does not yet support models of class `glmmTMB`.

#overdispersion
check_overdispersion(platform_richness_mod2_poiss)
## # Overdispersion test
## 
##        dispersion ratio =  0.641
##   Pearson's Chi-Squared = 11.535
##                 p-value =   0.87
## No overdispersion detected.
#collinearity
check_collinearity(platform_richness_mod2_poiss)
## # Check for Multicollinearity
## 
## Low Correlation
## 
##                    Term  VIF   VIF 95% CI Increased SE Tolerance
##  Floral_simpson_index_T 2.13 [1.57, 3.25]         1.46      0.47
##              top2_ratio 1.11 [1.01, 2.54]         1.05      0.90
##               Site_type 2.77 [1.96, 4.27]         1.66      0.36
##        dm_wind_velocity 3.53 [2.43, 5.48]         1.88      0.28
##        Days_since_start 1.25 [1.06, 2.05]         1.12      0.80
##            Plot_Cover_T 2.90 [2.04, 4.47]         1.70      0.34
##  Tolerance 95% CI
##      [0.31, 0.64]
##      [0.39, 0.99]
##      [0.23, 0.51]
##      [0.18, 0.41]
##      [0.49, 0.94]
##      [0.22, 0.49]
## 
## Moderate Correlation
## 
##            Term  VIF   VIF 95% CI Increased SE Tolerance Tolerance 95% CI
##  dm_temperature 5.09 [3.39, 7.99]         2.26      0.20     [0.13, 0.29]
# dharma package - simulate residuals and check model assumptions
platform_richness_mod2_poiss_sim_res <- simulateResiduals(fittedModel = platform_richness_mod2_poiss)
plot(platform_richness_mod2_poiss_sim_res)

#remove top2 ratio (p = 0.534 for platform_richness_mod2_poiss)
platform_richness_mod3_poiss <- glmmTMB(richness 
                                        ~ Floral_simpson_index_T 
                                        #+ top2_ratio
                                        + Site_type
                                        #+ rec_time_min
                                        + dm_wind_velocity
                                        + dm_temperature
                                        + Days_since_start
                                        + Plot_Cover_T
                                        + (1 | location),
                                        family = poisson(),
                                        data = platform_richness)

summary(platform_richness_mod3_poiss)
##  Family: poisson  ( log )
## Formula:          
## richness ~ Floral_simpson_index_T + Site_type + dm_wind_velocity +  
##     dm_temperature + Days_since_start + Plot_Cover_T + (1 | location)
## Data: platform_richness
## 
##      AIC      BIC   logLik deviance df.resid 
##    118.5    128.8    -51.2    102.5       19 
## 
## Random effects:
## 
## Conditional model:
##  Groups   Name        Variance  Std.Dev. 
##  location (Intercept) 1.694e-10 1.301e-05
## Number of obs: 27, groups:  location, 9
## 
## Conditional model:
##                         Estimate Std. Error z value Pr(>|z|)    
## (Intercept)              1.41885    0.22637   6.268 3.66e-10 ***
## Floral_simpson_index_T   0.17904    0.11968   1.496 0.134646    
## Site_typeyoung_restored  0.53059    0.29298   1.811 0.070137 .  
## dm_wind_velocity        -0.45904    0.15886  -2.890 0.003857 ** 
## dm_temperature          -0.69583    0.18302  -3.802 0.000144 ***
## Days_since_start        -0.07216    0.06780  -1.064 0.287167    
## Plot_Cover_T             0.02309    0.01265   1.826 0.067912 .  
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
parameters(platform_richness_mod3_poiss)
## Your model may suffer from singularity (see `?lme4::isSingular` and
##   `?performance::check_singularity`).
##   Some of the confidence intervals of the random effects parameters are
##   probably not meaningful!
##   You may try to impose a prior on the random effects parameters, e.g.
##   using the glmmTMB package.
## # Fixed Effects
## 
## Parameter                  | Log-Mean |   SE |         95% CI |     z |      p
## ------------------------------------------------------------------------------
## (Intercept)                |     1.42 | 0.23 | [ 0.98,  1.86] |  6.27 | < .001
## Floral simpson index T     |     0.18 | 0.12 | [-0.06,  0.41] |  1.50 | 0.135 
## Site type [young_restored] |     0.53 | 0.29 | [-0.04,  1.10] |  1.81 | 0.070 
## dm wind velocity           |    -0.46 | 0.16 | [-0.77, -0.15] | -2.89 | 0.004 
## dm temperature             |    -0.70 | 0.18 | [-1.05, -0.34] | -3.80 | < .001
## Days since start           |    -0.07 | 0.07 | [-0.21,  0.06] | -1.06 | 0.287 
## Plot Cover T               |     0.02 | 0.01 | [ 0.00,  0.05] |  1.83 | 0.068 
## 
## # Random Effects
## 
## Parameter                | Coefficient |      95% CI
## ----------------------------------------------------
## SD (Intercept: location) |    1.30e-05 | [0.00, Inf]
## 
## Uncertainty intervals (equal-tailed) and p-values (two-tailed) computed
##   using a Wald z-distribution approximation.
#check for singularity
performance::check_singularity(platform_richness_mod3_poiss)
## [1] TRUE
#check the model
check_model(platform_richness_mod3_poiss, verbose = T)
## Homogeneity of variance could not be computed. Cannot extract residual
##   variance from objects of class 'glmmTMB'.
## `check_outliers()` does not yet support models of class `glmmTMB`.

#overdispersion
check_overdispersion(platform_richness_mod3_poiss)
## # Overdispersion test
## 
##        dispersion ratio =  0.618
##   Pearson's Chi-Squared = 11.750
##                 p-value =  0.896
## No overdispersion detected.
#collinearity
check_collinearity(platform_richness_mod3_poiss)
## # Check for Multicollinearity
## 
## Low Correlation
## 
##                    Term  VIF   VIF 95% CI Increased SE Tolerance Tolerance 95% CI
##  Floral_simpson_index_T 2.08 [1.52, 3.26]         1.44      0.48     [0.31, 0.66]
##               Site_type 2.76 [1.92, 4.36]         1.66      0.36     [0.23, 0.52]
##        dm_wind_velocity 3.51 [2.37, 5.59]         1.87      0.29     [0.18, 0.42]
##        Days_since_start 1.28 [1.07, 2.14]         1.13      0.78     [0.47, 0.94]
##            Plot_Cover_T 2.82 [1.96, 4.46]         1.68      0.35     [0.22, 0.51]
## 
## Moderate Correlation
## 
##            Term  VIF   VIF 95% CI Increased SE Tolerance Tolerance 95% CI
##  dm_temperature 5.02 [3.28, 8.08]         2.24      0.20     [0.12, 0.30]
# DHARMa package - simulate residuals and check model assumptions
platform_richness_mod3_poiss_sim_res <- simulateResiduals(fittedModel = platform_richness_mod3_poiss)
plot(platform_richness_mod3_poiss_sim_res)
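
The dispersion ratio reported by check_overdispersion() can be reproduced by hand from numbers already printed above. The sketch below assumes the ratio is Pearson's chi-squared divided by the residual degrees of freedom, and that the p-value is the upper tail of a chi-squared distribution; the variable names are ours.

```r
# Values copied from the output for platform_richness_mod3_poiss
pearson_chisq <- 11.750  # Pearson's Chi-Squared from check_overdispersion()
df_resid      <- 19      # df.resid from summary()

# Ratio near 1 = variance matches the Poisson assumption; well above 1 = overdispersion
dispersion_ratio <- pearson_chisq / df_resid

# Upper-tail probability: a small value would indicate overdispersion
p_value <- pchisq(pearson_chisq, df = df_resid, lower.tail = FALSE)

round(c(dispersion_ratio = dispersion_ratio, p_value = p_value), 3)
```

A ratio below 1, as here, points towards mild underdispersion rather than overdispersion.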

#remove Days_since_start (p = 0.287 in platform_richness_mod3_poiss)
platform_richness_mod4_poiss <- glmmTMB(richness 
                                        ~ Floral_simpson_index_T 
                                        #+ top2_ratio
                                        #+ rec_time_min
                                        + dm_wind_velocity
                                        + Site_type
                                        + dm_temperature
                                        #+ Days_since_start
                                        + Plot_Cover_T
                                        + (1 | location),
                                        family = poisson(),
                                        data = platform_richness)
summary(platform_richness_mod4_poiss)
##  Family: poisson  ( log )
## Formula:          
## richness ~ Floral_simpson_index_T + dm_wind_velocity + Site_type +  
##     dm_temperature + Plot_Cover_T + (1 | location)
## Data: platform_richness
## 
##      AIC      BIC   logLik deviance df.resid 
##    117.6    126.7    -51.8    103.6       20 
## 
## Random effects:
## 
## Conditional model:
##  Groups   Name        Variance  Std.Dev. 
##  location (Intercept) 1.089e-10 1.044e-05
## Number of obs: 27, groups:  location, 9
## 
## Conditional model:
##                         Estimate Std. Error z value Pr(>|z|)    
## (Intercept)              1.39525    0.22736   6.137 8.43e-10 ***
## Floral_simpson_index_T   0.20001    0.11964   1.672 0.094577 .  
## dm_wind_velocity        -0.52573    0.14536  -3.617 0.000298 ***
## Site_typeyoung_restored  0.55761    0.29445   1.894 0.058261 .  
## dm_temperature          -0.74058    0.18069  -4.099 4.15e-05 ***
## Plot_Cover_T             0.02558    0.01264   2.025 0.042908 *  
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
parameters(platform_richness_mod4_poiss)
## Your model may suffer from singularity (see `?lme4::isSingular` and
##   `?performance::check_singularity`).
##   Some of the confidence intervals of the random effects parameters are
##   probably not meaningful!
##   You may try to impose a prior on the random effects parameters, e.g.
##   using the glmmTMB package.
## # Fixed Effects
## 
## Parameter                  | Log-Mean |   SE |         95% CI |     z |      p
## ------------------------------------------------------------------------------
## (Intercept)                |     1.40 | 0.23 | [ 0.95,  1.84] |  6.14 | < .001
## Floral simpson index T     |     0.20 | 0.12 | [-0.03,  0.43] |  1.67 | 0.095 
## dm wind velocity           |    -0.53 | 0.15 | [-0.81, -0.24] | -3.62 | < .001
## Site type [young_restored] |     0.56 | 0.29 | [-0.02,  1.13] |  1.89 | 0.058 
## dm temperature             |    -0.74 | 0.18 | [-1.09, -0.39] | -4.10 | < .001
## Plot Cover T               |     0.03 | 0.01 | [ 0.00,  0.05] |  2.02 | 0.043 
## 
## # Random Effects
## 
## Parameter                | Coefficient |      95% CI
## ----------------------------------------------------
## SD (Intercept: location) |    1.04e-05 | [0.00, Inf]
## 
## Uncertainty intervals (equal-tailed) and p-values (two-tailed) computed
##   using a Wald z-distribution approximation.
#check for singularity
performance::check_singularity(platform_richness_mod4_poiss)
## [1] TRUE
#check the model
check_model(platform_richness_mod4_poiss, verbose = T)
## Homogeneity of variance could not be computed. Cannot extract residual
##   variance from objects of class 'glmmTMB'.
## `check_outliers()` does not yet support models of class `glmmTMB`.

#overdispersion
check_overdispersion(platform_richness_mod4_poiss)
## # Overdispersion test
## 
##        dispersion ratio =  0.627
##   Pearson's Chi-Squared = 12.533
##                 p-value =  0.897
## No overdispersion detected.
#collinearity
check_collinearity(platform_richness_mod4_poiss)
## # Check for Multicollinearity
## 
## Low Correlation
## 
##                    Term  VIF   VIF 95% CI Increased SE Tolerance Tolerance 95% CI
##  Floral_simpson_index_T 2.05 [1.48, 3.30]         1.43      0.49     [0.30, 0.68]
##        dm_wind_velocity 2.95 [2.00, 4.79]         1.72      0.34     [0.21, 0.50]
##               Site_type 2.79 [1.91, 4.52]         1.67      0.36     [0.22, 0.52]
##          dm_temperature 4.77 [3.07, 7.87]         2.18      0.21     [0.13, 0.33]
##            Plot_Cover_T 2.75 [1.88, 4.46]         1.66      0.36     [0.22, 0.53]
# DHARMa package - simulate residuals and check model assumptions
platform_richness_mod4_poiss_sim_res <- simulateResiduals(fittedModel = platform_richness_mod4_poiss)
plot(platform_richness_mod4_poiss_sim_res)
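
For readers unfamiliar with simulation-based residuals, the idea behind the DHARMa plots can be sketched in base R. This is a conceptual illustration on toy data with a plain Poisson GLM, not the DHARMa implementation and not the study data; all object names are hypothetical.

```r
set.seed(1)
# Toy data: Poisson counts driven by one predictor
n <- 100
x <- rnorm(n)
y <- rpois(n, lambda = exp(1 + 0.5 * x))
fit <- glm(y ~ x, family = poisson())

# Simulate new responses from the fitted model
n_sim <- 250
sims  <- as.matrix(simulate(fit, nsim = n_sim))  # n rows x n_sim columns

# Scaled residual: where each observation falls within its own simulated values
scaled_res <- vapply(seq_len(n), function(i) {
  below <- mean(sims[i, ] < y[i])
  ties  <- mean(sims[i, ] == y[i])
  below + runif(1) * ties  # spread ties at random because counts are discrete
}, numeric(1))

# For a well-specified model these values should be roughly uniform on [0, 1]
summary(scaled_res)
ks.test(scaled_res, "punif")
```

Systematic departures from uniformity in these values are what the QQ plot and the residual-vs-predicted panel produced by plot() are meant to reveal.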

#remove Floral_simpson_index_T (p = 0.095 in platform_richness_mod4_poiss)
platform_richness_mod5_poiss <- glmmTMB(richness 
                                        #~ Floral_simpson_index_T 
                                        #+ top2_ratio
                                        ~ Site_type
                                        #+ rec_time_min
                                        + dm_wind_velocity
                                        + dm_temperature
                                        #+ Days_since_start
                                        + Plot_Cover_T
                                        + (1 | location),
                                        family = poisson(),
                                        data = platform_richness)
summary(platform_richness_mod5_poiss)
##  Family: poisson  ( log )
## Formula:          
## richness ~ Site_type + dm_wind_velocity + dm_temperature + Plot_Cover_T +  
##     (1 | location)
## Data: platform_richness
## 
##      AIC      BIC   logLik deviance df.resid 
##    118.4    126.2    -53.2    106.4       21 
## 
## Random effects:
## 
## Conditional model:
##  Groups   Name        Variance  Std.Dev. 
##  location (Intercept) 1.527e-10 1.236e-05
## Number of obs: 27, groups:  location, 9
## 
## Conditional model:
##                          Estimate Std. Error z value Pr(>|z|)    
## (Intercept)              1.575225   0.195606   8.053 8.08e-16 ***
## Site_typeyoung_restored  0.363724   0.266760   1.363 0.172729    
## dm_wind_velocity        -0.523148   0.143098  -3.656 0.000256 ***
## dm_temperature          -0.693798   0.174852  -3.968 7.25e-05 ***
## Plot_Cover_T             0.011621   0.009457   1.229 0.219144    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
parameters(platform_richness_mod5_poiss)
## Your model may suffer from singularity (see `?lme4::isSingular` and
##   `?performance::check_singularity`).
##   Some of the confidence intervals of the random effects parameters are
##   probably not meaningful!
##   You may try to impose a prior on the random effects parameters, e.g.
##   using the glmmTMB package.
## # Fixed Effects
## 
## Parameter                  | Log-Mean |       SE |         95% CI |     z |      p
## ----------------------------------------------------------------------------------
## (Intercept)                |     1.58 |     0.20 | [ 1.19,  1.96] |  8.05 | < .001
## Site type [young_restored] |     0.36 |     0.27 | [-0.16,  0.89] |  1.36 | 0.173 
## dm wind velocity           |    -0.52 |     0.14 | [-0.80, -0.24] | -3.66 | < .001
## dm temperature             |    -0.69 |     0.17 | [-1.04, -0.35] | -3.97 | < .001
## Plot Cover T               |     0.01 | 9.46e-03 | [-0.01,  0.03] |  1.23 | 0.219 
## 
## # Random Effects
## 
## Parameter                | Coefficient |      95% CI
## ----------------------------------------------------
## SD (Intercept: location) |    1.24e-05 | [0.00, Inf]
## 
## Uncertainty intervals (equal-tailed) and p-values (two-tailed) computed
##   using a Wald z-distribution approximation.
#check for singularity
performance::check_singularity(platform_richness_mod5_poiss)
## [1] TRUE
#check the model
check_model(platform_richness_mod5_poiss, verbose = T)
## Homogeneity of variance could not be computed. Cannot extract residual
##   variance from objects of class 'glmmTMB'.
## `check_outliers()` does not yet support models of class `glmmTMB`.

#overdispersion
check_overdispersion(platform_richness_mod5_poiss)
## # Overdispersion test
## 
##        dispersion ratio =  0.758
##   Pearson's Chi-Squared = 15.926
##                 p-value =  0.774
## No overdispersion detected.
#collinearity
check_collinearity(platform_richness_mod5_poiss)
## # Check for Multicollinearity
## 
## Low Correlation
## 
##              Term  VIF   VIF 95% CI Increased SE Tolerance Tolerance 95% CI
##         Site_type 2.29 [1.59, 3.79]         1.51      0.44     [0.26, 0.63]
##  dm_wind_velocity 2.82 [1.90, 4.71]         1.68      0.35     [0.21, 0.53]
##    dm_temperature 4.44 [2.81, 7.50]         2.11      0.23     [0.13, 0.36]
##      Plot_Cover_T 1.46 [1.14, 2.47]         1.21      0.68     [0.40, 0.87]
# DHARMa package - simulate residuals and check model assumptions
platform_richness_mod5_poiss_sim_res <- simulateResiduals(fittedModel = platform_richness_mod5_poiss)
plot(platform_richness_mod5_poiss_sim_res)

#remove Plot_Cover_T (p = 0.219 in platform_richness_mod5_poiss)
platform_richness_mod6_poiss <- glmmTMB(richness 
                                        #~ Floral_simpson_index_T 
                                        #+ top2_ratio
                                        ~ Site_type
                                        #+ rec_time_min
                                        + dm_wind_velocity
                                        + dm_temperature
                                        #+ Days_since_start
                                        #+ Plot_Cover_T
                                        + (1 | location),
                                        family = poisson(),
                                        data = platform_richness)
summary(platform_richness_mod6_poiss)
##  Family: poisson  ( log )
## Formula:          
## richness ~ Site_type + dm_wind_velocity + dm_temperature + (1 |      location)
## Data: platform_richness
## 
##      AIC      BIC   logLik deviance df.resid 
##    117.9    124.3    -53.9    107.9       22 
## 
## Random effects:
## 
## Conditional model:
##  Groups   Name        Variance Std.Dev. 
##  location (Intercept) 1.29e-10 1.136e-05
## Number of obs: 27, groups:  location, 9
## 
## Conditional model:
##                         Estimate Std. Error z value Pr(>|z|)    
## (Intercept)               1.7656     0.1099  16.071  < 2e-16 ***
## Site_typeyoung_restored   0.2019     0.2280   0.885 0.375948    
## dm_wind_velocity         -0.4583     0.1331  -3.443 0.000576 ***
## dm_temperature           -0.5867     0.1506  -3.896 9.79e-05 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
parameters(platform_richness_mod6_poiss)
## Your model may suffer from singularity (see `?lme4::isSingular` and
##   `?performance::check_singularity`).
##   Some of the confidence intervals of the random effects parameters are
##   probably not meaningful!
##   You may try to impose a prior on the random effects parameters, e.g.
##   using the glmmTMB package.
## # Fixed Effects
## 
## Parameter                  | Log-Mean |   SE |         95% CI |     z |      p
## ------------------------------------------------------------------------------
## (Intercept)                |     1.77 | 0.11 | [ 1.55,  1.98] | 16.07 | < .001
## Site type [young_restored] |     0.20 | 0.23 | [-0.25,  0.65] |  0.89 | 0.376 
## dm wind velocity           |    -0.46 | 0.13 | [-0.72, -0.20] | -3.44 | < .001
## dm temperature             |    -0.59 | 0.15 | [-0.88, -0.29] | -3.90 | < .001
## 
## # Random Effects
## 
## Parameter                | Coefficient |      95% CI
## ----------------------------------------------------
## SD (Intercept: location) |    1.14e-05 | [0.00, Inf]
## 
## Uncertainty intervals (equal-tailed) and p-values (two-tailed) computed
##   using a Wald z-distribution approximation.
#check for singularity
performance::check_singularity(platform_richness_mod6_poiss)
## [1] TRUE
#check the model
check_model(platform_richness_mod6_poiss, verbose = T)
## Homogeneity of variance could not be computed. Cannot extract residual
##   variance from objects of class 'glmmTMB'.
## `check_outliers()` does not yet support models of class `glmmTMB`.

#overdispersion
check_overdispersion(platform_richness_mod6_poiss)
## # Overdispersion test
## 
##        dispersion ratio =  0.788
##   Pearson's Chi-Squared = 17.336
##                 p-value =  0.745
## No overdispersion detected.
#collinearity
check_collinearity(platform_richness_mod6_poiss)
## # Check for Multicollinearity
## 
## Low Correlation
## 
##              Term  VIF   VIF 95% CI Increased SE Tolerance Tolerance 95% CI
##         Site_type 1.67 [1.24, 2.86]         1.29      0.60     [0.35, 0.80]
##  dm_wind_velocity 2.45 [1.66, 4.18]         1.56      0.41     [0.24, 0.60]
##    dm_temperature 3.28 [2.12, 5.64]         1.81      0.31     [0.18, 0.47]
# DHARMa package - simulate residuals and check model assumptions
platform_richness_mod6_poiss_sim_res <- simulateResiduals(fittedModel = platform_richness_mod6_poiss)
plot(platform_richness_mod6_poiss_sim_res)

#remove Site_type (p = 0.376 in platform_richness_mod6_poiss)
platform_richness_mod7_poiss <- glmmTMB(richness 
                                        #~ Floral_simpson_index_T 
                                        #+ top2_ratio
                                        #+ Site_type
                                        ~ dm_wind_velocity
                                        + dm_temperature
                                        #+ Days_since_start
                                        #+ Plot_Cover_T
                                        + (1 | location),
                                        family = poisson(),
                                        data = platform_richness)
summary(platform_richness_mod7_poiss)
##  Family: poisson  ( log )
## Formula:          richness ~ dm_wind_velocity + dm_temperature + (1 | location)
## Data: platform_richness
## 
##      AIC      BIC   logLik deviance df.resid 
##    116.7    121.8    -54.3    108.7       23 
## 
## Random effects:
## 
## Conditional model:
##  Groups   Name        Variance  Std.Dev. 
##  location (Intercept) 1.832e-10 1.353e-05
## Number of obs: 27, groups:  location, 9
## 
## Conditional model:
##                  Estimate Std. Error z value Pr(>|z|)    
## (Intercept)       1.82152    0.08773  20.762  < 2e-16 ***
## dm_wind_velocity -0.40730    0.11846  -3.438 0.000585 ***
## dm_temperature   -0.50531    0.11532  -4.382 1.18e-05 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
parameters(platform_richness_mod7_poiss)
## Your model may suffer from singularity (see `?lme4::isSingular` and
##   `?performance::check_singularity`).
##   Some of the confidence intervals of the random effects parameters are
##   probably not meaningful!
##   You may try to impose a prior on the random effects parameters, e.g.
##   using the glmmTMB package.
## # Fixed Effects
## 
## Parameter        | Log-Mean |   SE |         95% CI |     z |      p
## --------------------------------------------------------------------
## (Intercept)      |     1.82 | 0.09 | [ 1.65,  1.99] | 20.76 | < .001
## dm wind velocity |    -0.41 | 0.12 | [-0.64, -0.18] | -3.44 | < .001
## dm temperature   |    -0.51 | 0.12 | [-0.73, -0.28] | -4.38 | < .001
## 
## # Random Effects
## 
## Parameter                | Coefficient |      95% CI
## ----------------------------------------------------
## SD (Intercept: location) |    1.35e-05 | [0.00, Inf]
## 
## Uncertainty intervals (equal-tailed) and p-values (two-tailed) computed
##   using a Wald z-distribution approximation.
#check for singularity
performance::check_singularity(platform_richness_mod7_poiss)
## [1] TRUE
#check the model
check_model(platform_richness_mod7_poiss, verbose = T)
## Homogeneity of variance could not be computed. Cannot extract residual
##   variance from objects of class 'glmmTMB'.
## `check_outliers()` does not yet support models of class `glmmTMB`.

#overdispersion
check_overdispersion(platform_richness_mod7_poiss)
## # Overdispersion test
## 
##        dispersion ratio =  0.783
##   Pearson's Chi-Squared = 18.000
##                 p-value =  0.757
## No overdispersion detected.
#collinearity
check_collinearity(platform_richness_mod7_poiss)
## # Check for Multicollinearity
## 
## Low Correlation
## 
##              Term  VIF   VIF 95% CI Increased SE Tolerance Tolerance 95% CI
##  dm_wind_velocity 1.93 [1.36, 3.37]         1.39      0.52     [0.30, 0.73]
##    dm_temperature 1.93 [1.36, 3.37]         1.39      0.52     [0.30, 0.73]
# DHARMa package - simulate residuals and check model assumptions
platform_richness_mod7_poiss_sim_res <- simulateResiduals(fittedModel = platform_richness_mod7_poiss)
plot(platform_richness_mod7_poiss_sim_res)
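
Each elimination step can also be cross-checked with a likelihood ratio test on the nested models. The sketch below does this for the last step (dropping Site_type) using only the rounded log-likelihoods printed in the summaries above, so the result is approximate; the variable names are ours.

```r
# Log-likelihoods copied from the summary() outputs above
loglik_mod6 <- -53.9  # model with Site_type
loglik_mod7 <- -54.3  # model without Site_type

# Likelihood ratio statistic for the one parameter that was dropped
lr_stat <- 2 * (loglik_mod6 - loglik_mod7)
p_lr    <- pchisq(lr_stat, df = 1, lower.tail = FALSE)

round(c(LR = lr_stat, p = p_lr), 3)
```

The resulting p-value is close to the Wald p-value of 0.376 reported for Site_type. With the fitted objects available, anova(platform_richness_mod7_poiss, platform_richness_mod6_poiss) should give the same comparison from unrounded values.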

IV.C.2.a. Compare the models with the performance package

# Compare the models with the performance package
platform_richness_poiss_comp1 <- compare_performance(platform_richness_mod1_poiss, platform_richness_mod2_poiss, platform_richness_mod3_poiss, platform_richness_mod4_poiss, platform_richness_mod5_poiss, platform_richness_mod6_poiss, platform_richness_mod7_poiss,
                                                     metrics = c("AICc", "BIC", "R2", "ICC", "RMSE"))

# Print the comparison table
print(platform_richness_poiss_comp1)
## # Comparison of Model Performance Indices
## 
## Name                         |   Model | AICc (weights) | BIC (weights) |  RMSE
## -------------------------------------------------------------------------------
## platform_richness_mod6_poiss | glmmTMB |  120.7 (0.209) | 124.3 (0.187) | 1.786
## platform_richness_mod5_poiss | glmmTMB |  122.6 (0.081) | 126.2 (0.074) | 1.703
## platform_richness_mod1_poiss | glmmTMB |  135.7 (<.001) | 134.9 (<.001) | 1.477
## platform_richness_mod2_poiss | glmmTMB |  130.7 (0.001) | 131.8 (0.005) | 1.490
## platform_richness_mod3_poiss | glmmTMB |  126.5 (0.012) | 128.8 (0.020) | 1.493
## platform_richness_mod4_poiss | glmmTMB |  123.5 (0.052) | 126.7 (0.058) | 1.498
## platform_richness_mod7_poiss | glmmTMB |  118.5 (0.645) | 121.8 (0.656) | 1.817
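
The AICc values and weights in this table can be approximated by hand. The sketch below assumes the usual small-sample correction AICc = AIC + 2k(k + 1)/(n - k - 1) and the standard Akaike-weight formula, and uses the rounded values printed above, so the weights differ slightly from those reported. Object names are ours.

```r
# AICc for platform_richness_mod7_poiss from its AIC
n <- 27       # number of observations
k <- 4        # intercept + 2 slopes + 1 random-effect SD (n - df.resid)
aic_mod7  <- 116.7
aicc_mod7 <- aic_mod7 + (2 * k * (k + 1)) / (n - k - 1)

# AICc values copied from the comparison table
aicc <- c(mod1 = 135.7, mod2 = 130.7, mod3 = 126.5, mod4 = 123.5,
          mod5 = 122.6, mod6 = 120.7, mod7 = 118.5)

delta   <- aicc - min(aicc)         # difference to the best model
rel_lik <- exp(-0.5 * delta)        # relative likelihood of each model
weights <- rel_lik / sum(rel_lik)   # Akaike weights sum to 1

round(aicc_mod7, 1)
round(sort(weights, decreasing = TRUE), 3)
```

The weights can be read as the relative support for each model within this candidate set; platform_richness_mod7_poiss carries roughly two thirds of it.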

IV.C.2.b. Visualize the model results

#plot_model(platform_richness_mod1_poiss , type = "est", show.values = TRUE, value.offset = .3)
#plot_model(platform_richness_mod2_poiss , type = "est", show.values = TRUE, value.offset = .3)
#plot_model(platform_richness_mod3_poiss , type = "est", show.values = TRUE, value.offset = .3)
#plot_model(platform_richness_mod4_poiss , type = "est", show.values = TRUE, value.offset = .3)
plot_model(platform_richness_mod5_poiss , type = "est", show.values = TRUE, value.offset = .3)

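plot_model(platform_richness_mod5_poiss, 
           type = "est", 
           show.values = TRUE, 
           value.offset = 0.3,
           #sort.est = TRUE,
           #named vector: labels are matched to the coefficient names, whatever the plotting order
           #wind velocity and temperature are scaled, so no unit is shown
           axis.labels = c("Site_typeyoung_restored" = "Young Restored Site",
                           "dm_wind_velocity" = "Wind Velocity",
                           "dm_temperature" = "Temperature",
                           "Plot_Cover_T" = "Plot Cover"
                           )) +
    labs(title = "Platform Camera: Richness of Pollinators", x = "Predictors",y = "Estimate") + 
    theme(axis.text.y = element_text(hjust = 0))  # 0 = left, 1 = right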

plot_model(platform_richness_mod6_poiss , type = "est", show.values = TRUE, value.offset = .3)

plot_model(platform_richness_mod7_poiss , type = "est", show.values = TRUE, value.offset = .3)

## temperature platform_richness_mod5_poiss ---------
# Get the original mean and SD of temperature before scaling
temp_mean <- mean(envir_data$dm_temperature, na.rm = TRUE)
temp_sd <- sd(envir_data$dm_temperature, na.rm = TRUE)
# Get predictions on the scaled variable
pred_temp <- ggpredict(platform_richness_mod5_poiss , terms = "dm_temperature")
# Unscale the x-axis
pred_temp$x_unscaled <- (pred_temp$x * temp_sd) + temp_mean
# Plot
ggplot(pred_temp, aes(x = x_unscaled, y = predicted)) +
  #plot with predictor_color
  geom_line(size = 1.2, color = predictor_colors[["dm_temperature"]]) +
  geom_ribbon(aes(ymin = conf.low, ymax = conf.high), 
              fill = alpha(predictor_colors[["dm_temperature"]], 0.5)) +
  labs(
    title = "Platform cameras: Predicted Insect Richness vs Temperature",
    x = "Temperature",
    y = "Predicted Insect Richness"
  ) 

## wind velocity platform_richness_mod5_poiss ---------
# Get the original mean and SD of wind velocity before scaling
wind_mean <- mean(envir_data$dm_wind_velocity, na.rm = TRUE)
wind_sd <- sd(envir_data$dm_wind_velocity, na.rm = TRUE)
# Get predictions on the scaled variable
pred_wind <- ggpredict(platform_richness_mod5_poiss , terms = "dm_wind_velocity")
# Unscale the x-axis
pred_wind$x_unscaled <- (pred_wind$x * wind_sd) + wind_mean
# Plot
ggplot(pred_wind, aes(x = x_unscaled, y = predicted)) +
  #plot with predictor color
  geom_line(size = 1.2, color = predictor_colors[["dm_wind_velocity"]]) +
  geom_ribbon(aes(ymin = conf.low, ymax = conf.high), 
              fill = alpha(predictor_colors[["dm_wind_velocity"]], 0.5)) +
  labs(
    title = "Platform cameras: Predicted Insect Richness vs Wind velocity",
    x = "Wind velocity (m/s)",
    y = "Predicted Insect Richness"
  ) 

IV.C.2.c. interpretation of the model results

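The models in this section use a Poisson distribution with a log link and a random intercept for location. None of the models shown includes an offset; recording time (rec_time_min) appears only as a candidate predictor that is commented out of the reduced models. The final model fits the data adequately, with no overdispersion (dispersion ratio = 0.783, p = 0.757) and low collinearity (VIF = 1.93 for both predictors). However, the variance of the location intercept is estimated at essentially zero in every model (check_singularity() returns TRUE), so the random effect contributes almost nothing and its confidence interval is not meaningful. The most parsimonious model is the one with the lowest AICc (118.5) and BIC (121.8), platform_richness_mod7_poiss, which retains only wind velocity and temperature as fixed effects. This model shows that insect richness captured by the cameras decreases with increasing wind velocity and temperature. The previous models also show that the floral Simpson index, time of day, site type and days since start are not significant predictors of insect richness.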

The fourth model also has one of the lowest AICc and BIC values and still meets the model assumptions (the DHARMa plot of fitted vs. residual values looks acceptable, no overdispersion, etc.).
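
Because the model uses a log link, the coefficients are easier to read as incidence rate ratios. The sketch below back-transforms the estimates printed in summary(platform_richness_mod7_poiss) with Wald intervals; it assumes, as the unscaling code above suggests, that both predictors were standardized, so each ratio refers to a one-SD increase. Object names are ours.

```r
# Estimates and standard errors copied from summary(platform_richness_mod7_poiss)
est <- c(dm_wind_velocity = -0.40730, dm_temperature = -0.50531)
se  <- c(dm_wind_velocity =  0.11846, dm_temperature =  0.11532)

z <- qnorm(0.975)  # multiplier for a 95% Wald interval

irr <- data.frame(
  IRR     = exp(est),           # multiplicative change in expected richness
  CI_low  = exp(est - z * se),
  CI_high = exp(est + z * se)
)
round(irr, 2)
```

Read this way, a one-SD increase in wind velocity multiplies the expected richness by about 0.67 and a one-SD increase in temperature by about 0.60, holding the other predictor constant.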

#remove all objects starting with platform_richness_
rm(list = ls(pattern = "^platform_richness_"))

IV.C.3. PLATFORM CAMERAS Shannon - GAUSSIAN lmer (rec_time_min as predictor)

# diversity of insects per transect per site
platform_diversity <- platform_camera %>%
  #create new column with counts = 1
  mutate(count = 1) %>%
  #wide format filled with counts, fill empty cells with 0
  pivot_wider(names_from = top1, values_from = count, values_fill = 0) %>%
  #remove irrelevant columns
 dplyr::select(-c(ID, Site_Tn, det_conf_mean, track_ID_imgs, top1_imgs, top1_prob_mean, top1_prob_weighted, cam_ID, start_time))%>%
  #sum up rows that have same transect and site
  group_by(location, transect,date) %>%
  summarise(across(fly_sarco:fly_empi, sum), .groups = 'drop') %>%
  #calculate shannon index and simpson index
  mutate(shannon_diversity = diversity(across(fly_sarco:fly_empi), index = "shannon"),
         simpson_diversity = diversity(across(fly_sarco:fly_empi), index = "simpson"))%>%
  #remove irrelevant columns
 dplyr::select(-c(fly_sarco:fly_empi), -date) %>%
  #join the scaled_envir_data
  left_join(platform_camera1, by = c("location", "transect"))%>%
  #REMOVE EXTRA COLUMNS "ID" "Site_Tn" "det_conf_mean" "track_ID_imgs" "top1_imgs" "top1_prob_mean" "top1_prob_weighted"
 dplyr::select(-c(ID, Site_Tn, det_conf_mean, track_ID_imgs, top1, top1_imgs, top1_prob_mean, top1_prob_weighted, date,cam_ID, start_time)) %>%
  #keep only unique rows
  distinct()
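
As a reminder of what the two indices measure, the sketch below computes them by hand for a single hypothetical transect. It assumes the definitions used by vegan::diversity(): Shannon as -sum(p * log(p)) with natural logarithms, and "simpson" as 1 - sum(p^2). The function names and the counts are illustrative only.

```r
# Shannon index: -sum(p * ln p) over taxa that are present
shannon_index <- function(counts) {
  p <- counts / sum(counts)
  p <- p[p > 0]            # absent taxa contribute nothing
  -sum(p * log(p))
}

# Simpson index in its 1 - D form: probability that two random individuals differ
simpson_index <- function(counts) {
  p <- counts / sum(counts)
  1 - sum(p^2)
}

# Hypothetical counts of four taxa in one transect (not study data)
toy_counts <- c(taxon_a = 10, taxon_b = 5, taxon_c = 4, taxon_d = 1)
shannon_index(toy_counts)
simpson_index(toy_counts)
```

A transect with a single taxon gives 0 for both indices, which explains the lower bound of the observed Shannon range.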

#scale rec_time_min and Plot_Cover_T
#as.numeric() drops the matrix attributes that scale() returns
platform_diversity <- platform_diversity %>%
  mutate(rec_time_min_scaled = as.numeric(scale(rec_time_min)),
         Plot_Cover_T = as.numeric(scale(Plot_Cover_T)))
#histogram of shannon index
platform_diversity %>%
  ggplot(aes(x = shannon_diversity)) +
  geom_histogram(binwidth = 0.1, fill = "lightblue", color = "black") +
  labs(title = "Histogram of Insect Shannon Diversity",
       x = "Insect Shannon Diversity",
       y = "Count")

#testing the normality of the shannon index
shapiro.test(platform_diversity$shannon_diversity) # p-value = 0.1193, no significant deviation from normality
## 
##  Shapiro-Wilk normality test
## 
## data:  platform_diversity$shannon_diversity
## W = 0.93963, p-value = 0.1193
datawizard::describe_distribution(platform_diversity$shannon_diversity)
## Mean |   SD |  IQR |        Range | Skewness | Kurtosis |  n | n_Missing
## ------------------------------------------------------------------------
## 1.10 | 0.55 | 0.79 | [0.00, 1.99] |    -0.59 |    -0.28 | 27 |         0

The Shannon diversity captured with the platform cameras does not deviate significantly from a normal distribution (Shapiro-Wilk test: W = 0.940, p-value = 0.1193). It has a skewness of -0.59, indicating a left skew, and a kurtosis of -0.28, indicating a slightly platykurtic distribution (flatter than normal), so it is reasonably close to normal.

Since approximate normality is assumed, we will use a Gaussian distribution for the model. However, with the identity link of a Gaussian model, an offset cannot act as an exposure term the way it does in a log-link count model. Here, there are two options: either add rec_time_min as a predictor, or divide the diversity index by the recording time so that the response variable is the diversity index per minute of recording.

A first attempt with rates was not successful. On reflection, a diversity index expressed per minute is also hard to interpret, so we decided to use a Gaussian model with rec_time_min as a predictor.
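
The difference between the two ways of handling sampling effort can be illustrated on simulated data. This is a toy sketch in base R, not the study data: all variable names, sample sizes and effect sizes below are invented for illustration.

```r
set.seed(42)
# Toy data: counts accumulate with recording time
n        <- 2000
rec_time <- runif(n, 5, 15)   # minutes of recording
x        <- rnorm(n)          # a standardized predictor
n_insects <- rpois(n, lambda = rec_time * exp(log(2) + 0.5 * x))

# Log link: offset(log(time)) fixes the effort coefficient at 1,
# so the model describes a rate (counts per minute)
rate_fit <- glm(n_insects ~ x + offset(log(rec_time)), family = poisson())
coef(rate_fit)  # intercept should be near log(2), slope near 0.5

# Identity link: effort enters as an ordinary covariate with its own estimated slope
toy_index <- 1 + 0.3 * x + 0.05 * rec_time + rnorm(n, sd = 0.2)
gauss_fit <- lm(toy_index ~ x + scale(rec_time))
coef(gauss_fit)
```

In the Gaussian case the effect of recording time is estimated from the data rather than imposed, which is the role rec_time_min_scaled plays in the models below.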

# full model with insect shannon diversity as response variable and environmental, weather and plant diversity variables as explanatory variables, and site as random effect, and recording time is included to account for sampling effort differences



platform_shannon_mod1_gauss <- lmer(shannon_diversity 
                                   ~ Floral_simpson_index_T 
                                   + rec_time_min_scaled
                                   + top2_ratio
                                   + Site_type
                                   + Days_since_start
                                   + dm_wind_velocity
                                   + dm_temperature
                                   + Plot_Cover_T
                                   + (1 | location), 
                                   data = platform_diversity)
summary(platform_shannon_mod1_gauss)
## Linear mixed model fit by REML ['lmerMod']
## Formula: shannon_diversity ~ Floral_simpson_index_T + rec_time_min_scaled +  
##     top2_ratio + Site_type + Days_since_start + dm_wind_velocity +  
##     dm_temperature + Plot_Cover_T + (1 | location)
##    Data: platform_diversity
## 
## REML criterion at convergence: 29.4
## 
## Scaled residuals: 
##     Min      1Q  Median      3Q     Max 
## -1.1772 -0.4823 -0.2646  0.5065  1.8025 
## 
## Random effects:
##  Groups   Name        Variance Std.Dev.
##  location (Intercept) 0.1863   0.4316  
##  Residual             0.0399   0.1997  
## Number of obs: 27, groups:  location, 9
## 
## Fixed effects:
##                         Estimate Std. Error t value
## (Intercept)              1.27825    0.20496   6.237
## Floral_simpson_index_T   0.25600    0.06386   4.009
## rec_time_min_scaled     -0.14405    0.04744  -3.037
## top2_ratio               0.01388    0.05072   0.274
## Site_typeyoung_restored  0.56383    0.37503   1.503
## Days_since_start        -0.15991    0.10974  -1.457
## dm_wind_velocity        -0.25505    0.23016  -1.108
## dm_temperature          -0.58270    0.22429  -2.598
## Plot_Cover_T             0.32793    0.07369   4.450
## 
## Correlation of Fixed Effects:
##             (Intr) Fl___T rc_t__ tp2_rt St_ty_ Dys_s_ dm_wn_ dm_tmp
## Flrl_smp__T  0.075                                                 
## rc_tm_mn_sc -0.058 -0.033                                          
## top2_ratio  -0.086 -0.221  0.088                                   
## St_typyng_r -0.499  0.115 -0.032 -0.012                            
## Dys_snc_str  0.129  0.007 -0.005  0.030 -0.137                     
## dm_wnd_vlct -0.172  0.025  0.094 -0.018 -0.282 -0.475              
## dm_tempertr -0.072 -0.020  0.138  0.040 -0.539 -0.144  0.601       
## Plot_Covr_T  0.044  0.632 -0.212 -0.365  0.201  0.010 -0.086 -0.188
parameters(platform_shannon_mod1_gauss)
## # Fixed Effects
## 
## Parameter                  | Coefficient |   SE |         95% CI | t(16) |      p
## ---------------------------------------------------------------------------------
## (Intercept)                |        1.28 | 0.20 | [ 0.84,  1.71] |  6.24 | < .001
## Floral simpson index T     |        0.26 | 0.06 | [ 0.12,  0.39] |  4.01 | 0.001 
## rec time min scaled        |       -0.14 | 0.05 | [-0.24, -0.04] | -3.04 | 0.008 
## top2 ratio                 |        0.01 | 0.05 | [-0.09,  0.12] |  0.27 | 0.788 
## Site type [young_restored] |        0.56 | 0.38 | [-0.23,  1.36] |  1.50 | 0.152 
## Days since start           |       -0.16 | 0.11 | [-0.39,  0.07] | -1.46 | 0.164 
## dm wind velocity           |       -0.26 | 0.23 | [-0.74,  0.23] | -1.11 | 0.284 
## dm temperature             |       -0.58 | 0.22 | [-1.06, -0.11] | -2.60 | 0.019 
## Plot Cover T               |        0.33 | 0.07 | [ 0.17,  0.48] |  4.45 | < .001
## 
## # Random Effects
## 
## Parameter                | Coefficient |   SE |       95% CI
## ------------------------------------------------------------
## SD (Intercept: location) |        0.43 | 0.17 | [0.20, 0.93]
## SD (Residual)            |        0.20 | 0.04 | [0.14, 0.29]
## 
## Uncertainty intervals (equal-tailed) and p-values (two-tailed) computed
##   using a Wald t-distribution approximation. Uncertainty intervals for
##   random effect variances computed using a Wald z-distribution
##   approximation.
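
The note above says the intervals and p-values come from a Wald t-distribution approximation. As a minimal sketch of that arithmetic, the following recomputes two rows of the table from the estimates and standard errors printed by summary(). The variable names are illustrative, and the degrees of freedom are taken from the t(16) column header rather than derived.

```r
# Estimates and standard errors copied from the mod1 summary() output above
est <- c(Floral_simpson_index_T = 0.25600, dm_temperature = -0.58270)
se  <- c(Floral_simpson_index_T = 0.06386, dm_temperature = 0.22429)
df_resid <- 16  # shown as t(16) in the parameters() table

t_value <- est / se
t_crit  <- qt(0.975, df = df_resid)            # two-sided 95% critical value
wald_ci <- cbind(lower = est - t_crit * se,
                 upper = est + t_crit * se)
p_value <- 2 * pt(-abs(t_value), df = df_resid)  # two-tailed p-value

round(cbind(t = t_value, wald_ci, p = p_value), 3)
```

Rounded to two decimals, the intervals should agree with the [0.12, 0.39] and [-1.06, -0.11] entries in the table.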
#check for singularity
performance::check_singularity(platform_shannon_mod1_gauss)
## [1] FALSE
#check the model
check_model(platform_shannon_mod1_gauss, verbose = T)
## Some of the variables were in matrix-format - probably you used
##   `scale()` on your data?
##   If so, and you get an error, please try `datawizard::standardize()` to
##   standardize your data.
## Some of the variables were in matrix-format - probably you used
##   `scale()` on your data?
##   If so, and you get an error, please try `datawizard::standardize()` to
##   standardize your data.

#overdispersion
check_overdispersion(platform_shannon_mod1_gauss)
## # Overdispersion test
## 
##  dispersion ratio = 0.526
##           p-value = 0.176
## No overdispersion detected.
#collinearity
check_collinearity(platform_shannon_mod1_gauss)
## # Check for Multicollinearity
## 
## Low Correlation
## 
##                    Term  VIF   VIF 95% CI Increased SE Tolerance
##  Floral_simpson_index_T 1.73 [1.34, 2.56]         1.32      0.58
##     rec_time_min_scaled 1.08 [1.00, 3.04]         1.04      0.92
##              top2_ratio 1.16 [1.03, 2.02]         1.08      0.86
##               Site_type 1.57 [1.24, 2.32]         1.25      0.64
##        Days_since_start 1.45 [1.18, 2.17]         1.21      0.69
##        dm_wind_velocity 2.09 [1.56, 3.10]         1.45      0.48
##          dm_temperature 2.10 [1.57, 3.13]         1.45      0.48
##            Plot_Cover_T 2.03 [1.53, 3.01]         1.42      0.49
##  Tolerance 95% CI
##      [0.39, 0.75]
##      [0.33, 1.00]
##      [0.49, 0.97]
##      [0.43, 0.81]
##      [0.46, 0.85]
##      [0.32, 0.64]
##      [0.32, 0.64]
##      [0.33, 0.65]
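
The "Increased SE" and "Tolerance" columns above are simple transformations of the VIF: the square root and the reciprocal, respectively. A short sketch using three VIF values copied from the table (the vector name is illustrative):

```r
# VIF values copied from the check_collinearity() output for mod1
vif <- c(Floral_simpson_index_T = 1.73,
         dm_wind_velocity       = 2.09,
         dm_temperature         = 2.10)

increased_se <- sqrt(vif)  # factor by which the standard error is inflated
tolerance    <- 1 / vif    # share of a predictor's variance not explained by the others

round(cbind(VIF = vif, increased_se, tolerance), 2)
```

Because the printed VIFs are already rounded, recomputed values for other rows can differ from the table in the last digit.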
# DHARMa package - simulate residuals and check model assumptions
platform_shannon_mod1_gauss_sim_res <- simulateResiduals(fittedModel = platform_shannon_mod1_gauss)
plot(platform_shannon_mod1_gauss_sim_res)

#remove top2 ratio (p= 0.788  for platform_shannon_mod1_gauss)
platform_shannon_mod2_gauss <- lmer(shannon_diversity 
                                   ~ Floral_simpson_index_T 
                                   + rec_time_min_scaled
                                   #+ top2_ratio
                                   + Site_type
                                   + Days_since_start
                                   + dm_wind_velocity
                                   + dm_temperature
                                   + Plot_Cover_T
                                   + (1 | location), 
                                   data = platform_diversity)

summary(platform_shannon_mod2_gauss)
## Linear mixed model fit by REML ['lmerMod']
## Formula: shannon_diversity ~ Floral_simpson_index_T + rec_time_min_scaled +  
##     Site_type + Days_since_start + dm_wind_velocity + dm_temperature +  
##     Plot_Cover_T + (1 | location)
##    Data: platform_diversity
## 
## REML criterion at convergence: 25.3
## 
## Scaled residuals: 
##     Min      1Q  Median      3Q     Max 
## -1.1972 -0.5648 -0.2527  0.5192  1.7609 
## 
## Random effects:
##  Groups   Name        Variance Std.Dev.
##  location (Intercept) 0.19480  0.4414  
##  Residual             0.03704  0.1925  
## Number of obs: 27, groups:  location, 9
## 
## Fixed effects:
##                         Estimate Std. Error t value
## (Intercept)              1.28327    0.20794   6.171
## Floral_simpson_index_T   0.26089    0.06007   4.343
## rec_time_min_scaled     -0.14550    0.04556  -3.193
## Site_typeyoung_restored  0.56646    0.38113   1.486
## Days_since_start        -0.16077    0.11174  -1.439
## dm_wind_velocity        -0.25424    0.23414  -1.086
## dm_temperature          -0.58581    0.22761  -2.574
## Plot_Cover_T             0.33653    0.06613   5.089
## 
## Correlation of Fixed Effects:
##             (Intr) Fl___T rc_t__ St_ty_ Dys_s_ dm_wn_ dm_tmp
## Flrl_smp__T  0.054                                          
## rc_tm_mn_sc -0.048 -0.014                                   
## St_typyng_r -0.503  0.109 -0.029                            
## Dys_snc_str  0.132  0.013 -0.007 -0.137                     
## dm_wnd_vlct -0.174  0.021  0.091 -0.281 -0.475              
## dm_tempertr -0.070 -0.011  0.128 -0.538 -0.145  0.601       
## Plot_Covr_T  0.013  0.607 -0.194  0.200  0.021 -0.094 -0.177
parameters(platform_shannon_mod2_gauss)
## # Fixed Effects
## 
## Parameter                  | Coefficient |   SE |         95% CI | t(17) |      p
## ---------------------------------------------------------------------------------
## (Intercept)                |        1.28 | 0.21 | [ 0.84,  1.72] |  6.17 | < .001
## Floral simpson index T     |        0.26 | 0.06 | [ 0.13,  0.39] |  4.34 | < .001
## rec time min scaled        |       -0.15 | 0.05 | [-0.24, -0.05] | -3.19 | 0.005 
## Site type [young_restored] |        0.57 | 0.38 | [-0.24,  1.37] |  1.49 | 0.156 
## Days since start           |       -0.16 | 0.11 | [-0.40,  0.07] | -1.44 | 0.168 
## dm wind velocity           |       -0.25 | 0.23 | [-0.75,  0.24] | -1.09 | 0.293 
## dm temperature             |       -0.59 | 0.23 | [-1.07, -0.11] | -2.57 | 0.020 
## Plot Cover T               |        0.34 | 0.07 | [ 0.20,  0.48] |  5.09 | < .001
## 
## # Random Effects
## 
## Parameter                | Coefficient |   SE |       95% CI
## ------------------------------------------------------------
## SD (Intercept: location) |        0.44 | 0.17 | [0.21, 0.93]
## SD (Residual)            |        0.19 | 0.04 | [0.13, 0.28]
## 
## Uncertainty intervals (equal-tailed) and p-values (two-tailed) computed
##   using a Wald t-distribution approximation. Uncertainty intervals for
##   random effect variances computed using a Wald z-distribution
##   approximation.
#check for singularity
performance::check_singularity(platform_shannon_mod2_gauss)
## [1] FALSE
#check the model
check_model(platform_shannon_mod2_gauss, verbose = T)
## Some of the variables were in matrix-format - probably you used
##   `scale()` on your data?
##   If so, and you get an error, please try `datawizard::standardize()` to
##   standardize your data.
## Some of the variables were in matrix-format - probably you used
##   `scale()` on your data?
##   If so, and you get an error, please try `datawizard::standardize()` to
##   standardize your data.

#overdispersion
check_overdispersion(platform_shannon_mod2_gauss)
## # Overdispersion test
## 
##  dispersion ratio = 0.538
##           p-value =  0.24
## No overdispersion detected.
#collinearity
check_collinearity(platform_shannon_mod2_gauss)
## # Check for Multicollinearity
## 
## Low Correlation
## 
##                    Term  VIF   VIF 95% CI Increased SE Tolerance
##  Floral_simpson_index_T 1.64 [1.28, 2.50]         1.28      0.61
##     rec_time_min_scaled 1.07 [1.00, 4.25]         1.04      0.93
##               Site_type 1.56 [1.23, 2.38]         1.25      0.64
##        Days_since_start 1.45 [1.17, 2.24]         1.20      0.69
##        dm_wind_velocity 2.08 [1.54, 3.18]         1.44      0.48
##          dm_temperature 2.09 [1.54, 3.18]         1.45      0.48
##            Plot_Cover_T 1.75 [1.34, 2.66]         1.32      0.57
##  Tolerance 95% CI
##      [0.40, 0.78]
##      [0.24, 1.00]
##      [0.42, 0.82]
##      [0.45, 0.86]
##      [0.31, 0.65]
##      [0.31, 0.65]
##      [0.38, 0.75]
# DHARMa package - simulate residuals and check model assumptions
platform_shannon_mod2_gauss_sim_res <- simulateResiduals(fittedModel = platform_shannon_mod2_gauss)
plot(platform_shannon_mod2_gauss_sim_res)
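
The reduction steps in this section drop the term with the largest Wald p-value. An alternative check for a single dropped term is a likelihood-ratio test between the nested models. The sketch below is not the author's procedure: it uses the sleepstudy data shipped with lme4 and a hypothetical `noise` predictor, purely to show the mechanics.

```r
library(lme4)

set.seed(1)
d <- sleepstudy               # example data shipped with lme4
d$noise <- rnorm(nrow(d))     # hypothetical predictor unrelated to Reaction

# Fit with ML so that models differing in fixed effects are comparable
full    <- lmer(Reaction ~ Days + noise + (1 | Subject), data = d, REML = FALSE)
reduced <- update(full, . ~ . - noise)

# Likelihood-ratio test for the dropped term (1 degree of freedom)
lrt <- anova(reduced, full)
print(lrt)
```

A large p-value in the `Pr(>Chisq)` column would support dropping the term. With only 27 observations in the platform data, the chi-square approximation is itself rough, so this is a complement to the Wald tests rather than a replacement.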

# remove wind velocity (p= 0.293   for platform_shannon_mod2_gauss)
platform_shannon_mod3_gauss <- lmer(shannon_diversity 
                                   ~ Floral_simpson_index_T 
                                   + rec_time_min_scaled
                                   #+ top2_ratio
                                   + Site_type
                                   #+ dm_wind_velocity
                                   + Days_since_start
                                   + dm_temperature
                                   + Plot_Cover_T
                                   + (1 | location), 
                                   data = platform_diversity)

summary(platform_shannon_mod3_gauss)
## Linear mixed model fit by REML ['lmerMod']
## Formula: shannon_diversity ~ Floral_simpson_index_T + rec_time_min_scaled +  
##     Site_type + Days_since_start + dm_temperature + Plot_Cover_T +  
##     (1 | location)
##    Data: platform_diversity
## 
## REML criterion at convergence: 25.4
## 
## Scaled residuals: 
##     Min      1Q  Median      3Q     Max 
## -1.2125 -0.6575 -0.2318  0.5525  1.8034 
## 
## Random effects:
##  Groups   Name        Variance Std.Dev.
##  location (Intercept) 0.20124  0.4486  
##  Residual             0.03711  0.1926  
## Number of obs: 27, groups:  location, 9
## 
## Fixed effects:
##                         Estimate Std. Error t value
## (Intercept)              1.24391    0.20790   5.983
## Floral_simpson_index_T   0.26245    0.06013   4.365
## rec_time_min_scaled     -0.14118    0.04542  -3.108
## Site_typeyoung_restored  0.45056    0.37123   1.214
## Days_since_start        -0.21841    0.09986  -2.187
## dm_temperature          -0.43741    0.18456  -2.370
## Plot_Cover_T             0.33025    0.06591   5.011
## 
## Correlation of Fixed Effects:
##             (Intr) Fl___T rc_t__ St_ty_ Dys_s_ dm_tmp
## Flrl_smp__T  0.058                                   
## rc_tm_mn_sc -0.033 -0.016                            
## St_typyng_r -0.585  0.118 -0.004                     
## Dys_snc_str  0.057  0.026  0.041 -0.320              
## dm_tempertr  0.045 -0.029  0.091 -0.481  0.200       
## Plot_Covr_T -0.004  0.612 -0.188  0.180 -0.026 -0.149
parameters(platform_shannon_mod3_gauss)
## # Fixed Effects
## 
## Parameter                  | Coefficient |   SE |         95% CI | t(18) |      p
## ---------------------------------------------------------------------------------
## (Intercept)                |        1.24 | 0.21 | [ 0.81,  1.68] |  5.98 | < .001
## Floral simpson index T     |        0.26 | 0.06 | [ 0.14,  0.39] |  4.36 | < .001
## rec time min scaled        |       -0.14 | 0.05 | [-0.24, -0.05] | -3.11 | 0.006 
## Site type [young_restored] |        0.45 | 0.37 | [-0.33,  1.23] |  1.21 | 0.241 
## Days since start           |       -0.22 | 0.10 | [-0.43, -0.01] | -2.19 | 0.042 
## dm temperature             |       -0.44 | 0.18 | [-0.83, -0.05] | -2.37 | 0.029 
## Plot Cover T               |        0.33 | 0.07 | [ 0.19,  0.47] |  5.01 | < .001
## 
## # Random Effects
## 
## Parameter                | Coefficient |   SE |       95% CI
## ------------------------------------------------------------
## SD (Intercept: location) |        0.45 | 0.15 | [0.23, 0.87]
## SD (Residual)            |        0.19 | 0.04 | [0.13, 0.28]
## 
## Uncertainty intervals (equal-tailed) and p-values (two-tailed) computed
##   using a Wald t-distribution approximation. Uncertainty intervals for
##   random effect variances computed using a Wald z-distribution
##   approximation.
#check for singularity
performance::check_singularity(platform_shannon_mod3_gauss)
## [1] FALSE
#check the model
  
check_model(platform_shannon_mod3_gauss, verbose = T)
## Some of the variables were in matrix-format - probably you used
##   `scale()` on your data?
##   If so, and you get an error, please try `datawizard::standardize()` to
##   standardize your data.
## Some of the variables were in matrix-format - probably you used
##   `scale()` on your data?
##   If so, and you get an error, please try `datawizard::standardize()` to
##   standardize your data.

#overdispersion
check_overdispersion(platform_shannon_mod3_gauss)
## # Overdispersion test
## 
##  dispersion ratio = 0.646
##           p-value = 0.376
## No overdispersion detected.
#collinearity
check_collinearity(platform_shannon_mod3_gauss)
## # Check for Multicollinearity
## 
## Low Correlation
## 
##                    Term  VIF   VIF 95% CI Increased SE Tolerance
##  Floral_simpson_index_T 1.64 [1.26, 2.57]         1.28      0.61
##     rec_time_min_scaled 1.06 [1.00, 6.83]         1.03      0.94
##               Site_type 1.43 [1.15, 2.28]         1.20      0.70
##        Days_since_start 1.12 [1.01, 2.56]         1.06      0.89
##          dm_temperature 1.33 [1.09, 2.17]         1.15      0.75
##            Plot_Cover_T 1.73 [1.31, 2.71]         1.32      0.58
##  Tolerance 95% CI
##      [0.39, 0.79]
##      [0.15, 1.00]
##      [0.44, 0.87]
##      [0.39, 0.99]
##      [0.46, 0.91]
##      [0.37, 0.76]
# DHARMa package - simulate residuals and check model assumptions
platform_shannon_mod3_gauss_sim_res <- simulateResiduals(fittedModel = platform_shannon_mod3_gauss)
plot(platform_shannon_mod3_gauss_sim_res)

#remove site type (p= 0.241   for platform_shannon_mod3_gauss)
platform_shannon_mod4_gauss <- lmer(shannon_diversity 
                                   ~ Floral_simpson_index_T 
                                   + rec_time_min_scaled
                                   #+ top2_ratio
                                   #+ Site_type
                                   #+ dm_wind_velocity
                                   + Days_since_start
                                   + dm_temperature
                                   + Plot_Cover_T
                                   + (1 | location), 
                                   data = platform_diversity)
summary(platform_shannon_mod4_gauss)
## Linear mixed model fit by REML ['lmerMod']
## Formula: shannon_diversity ~ Floral_simpson_index_T + rec_time_min_scaled +  
##     Days_since_start + dm_temperature + Plot_Cover_T + (1 | location)
##    Data: platform_diversity
## 
## REML criterion at convergence: 26.7
## 
## Scaled residuals: 
##     Min      1Q  Median      3Q     Max 
## -1.2791 -0.6238 -0.2754  0.5871  1.7437 
## 
## Random effects:
##  Groups   Name        Variance Std.Dev.
##  location (Intercept) 0.21468  0.4633  
##  Residual             0.03735  0.1933  
## Number of obs: 27, groups:  location, 9
## 
## Fixed effects:
##                        Estimate Std. Error t value
## (Intercept)             1.39198    0.17379   8.010
## Floral_simpson_index_T  0.25463    0.05995   4.248
## rec_time_min_scaled    -0.14134    0.04559  -3.100
## Days_since_start       -0.17952    0.09752  -1.841
## dm_temperature         -0.33001    0.16676  -1.979
## Plot_Cover_T            0.31748    0.06513   4.875
## 
## Correlation of Fixed Effects:
##             (Intr) Fl___T rc_t__ Dys_s_ dm_tmp
## Flrl_smp__T  0.154                            
## rc_tm_mn_sc -0.042 -0.015                     
## Dys_snc_str -0.171  0.066  0.041              
## dm_tempertr -0.333  0.031  0.099  0.055       
## Plot_Covr_T  0.124  0.605 -0.190  0.033 -0.071
parameters(platform_shannon_mod4_gauss)
## # Fixed Effects
## 
## Parameter              | Coefficient |   SE |         95% CI | t(19) |      p
## -----------------------------------------------------------------------------
## (Intercept)            |        1.39 | 0.17 | [ 1.03,  1.76] |  8.01 | < .001
## Floral simpson index T |        0.25 | 0.06 | [ 0.13,  0.38] |  4.25 | < .001
## rec time min scaled    |       -0.14 | 0.05 | [-0.24, -0.05] | -3.10 | 0.006 
## Days since start       |       -0.18 | 0.10 | [-0.38,  0.02] | -1.84 | 0.081 
## dm temperature         |       -0.33 | 0.17 | [-0.68,  0.02] | -1.98 | 0.063 
## Plot Cover T           |        0.32 | 0.07 | [ 0.18,  0.45] |  4.87 | < .001
## 
## # Random Effects
## 
## Parameter                | Coefficient |   SE |       95% CI
## ------------------------------------------------------------
## SD (Intercept: location) |        0.46 | 0.14 | [0.25, 0.85]
## SD (Residual)            |        0.19 | 0.04 | [0.14, 0.28]
## 
## Uncertainty intervals (equal-tailed) and p-values (two-tailed) computed
##   using a Wald t-distribution approximation. Uncertainty intervals for
##   random effect variances computed using a Wald z-distribution
##   approximation.
#check for singularity
performance::check_singularity(platform_shannon_mod4_gauss)
## [1] FALSE
#check the model
check_model(platform_shannon_mod4_gauss, verbose = T)
## Some of the variables were in matrix-format - probably you used
##   `scale()` on your data?
##   If so, and you get an error, please try `datawizard::standardize()` to
##   standardize your data.
## Some of the variables were in matrix-format - probably you used
##   `scale()` on your data?
##   If so, and you get an error, please try `datawizard::standardize()` to
##   standardize your data.

#overdispersion
check_overdispersion(platform_shannon_mod4_gauss)
## # Overdispersion test
## 
##  dispersion ratio = 0.754
##           p-value = 0.624
## No overdispersion detected.
#collinearity
check_collinearity(platform_shannon_mod4_gauss)
## # Check for Multicollinearity
## 
## Low Correlation
## 
##                    Term  VIF       VIF 95% CI Increased SE Tolerance
##  Floral_simpson_index_T 1.62 [1.24,     2.61]         1.27      0.62
##     rec_time_min_scaled 1.06 [1.00,     8.49]         1.03      0.94
##        Days_since_start 1.01 [1.00, 3.56e+11]         1.00      0.99
##          dm_temperature 1.02 [1.00,  5516.07]         1.01      0.98
##            Plot_Cover_T 1.68 [1.27,     2.70]         1.30      0.60
##  Tolerance 95% CI
##      [0.38, 0.81]
##      [0.12, 1.00]
##      [0.00, 1.00]
##      [0.00, 1.00]
##      [0.37, 0.79]
# DHARMa package - simulate residuals and check model assumptions
platform_shannon_mod4_gauss_sim_res <- simulateResiduals(fittedModel = platform_shannon_mod4_gauss)
plot(platform_shannon_mod4_gauss_sim_res)

#remove days since start (p= 0.081   for platform_shannon_mod4_gauss)
platform_shannon_mod5_gauss <- lmer(shannon_diversity 
                                   ~ Floral_simpson_index_T 
                                   + rec_time_min_scaled
                                   #+ top2_ratio
                                   #+ Site_type
                                   #+ dm_wind_velocity
                                   #+ Days_since_start
                                   + dm_temperature
                                   + Plot_Cover_T
                                   + (1 | location), 
                                   data = platform_diversity)
summary(platform_shannon_mod5_gauss)
## Linear mixed model fit by REML ['lmerMod']
## Formula: shannon_diversity ~ Floral_simpson_index_T + rec_time_min_scaled +  
##     dm_temperature + Plot_Cover_T + (1 | location)
##    Data: platform_diversity
## 
## REML criterion at convergence: 26.9
## 
## Scaled residuals: 
##     Min      1Q  Median      3Q     Max 
## -1.1706 -0.6275 -0.2007  0.5021  1.6352 
## 
## Random effects:
##  Groups   Name        Variance Std.Dev.
##  location (Intercept) 0.29615  0.5442  
##  Residual             0.03714  0.1927  
## Number of obs: 27, groups:  location, 9
## 
## Fixed effects:
##                        Estimate Std. Error t value
## (Intercept)             1.33919    0.19879   6.737
## Floral_simpson_index_T  0.26397    0.05991   4.406
## rec_time_min_scaled    -0.14058    0.04554  -3.087
## dm_temperature         -0.31542    0.19359  -1.629
## Plot_Cover_T            0.32805    0.06532   5.023
## 
## Correlation of Fixed Effects:
##             (Intr) Fl___T rc_t__ dm_tmp
## Flrl_smp__T  0.145                     
## rc_tm_mn_sc -0.030 -0.016              
## dm_tempertr -0.331  0.024  0.083       
## Plot_Covr_T  0.114  0.605 -0.192 -0.063
parameters(platform_shannon_mod5_gauss)
## # Fixed Effects
## 
## Parameter              | Coefficient |   SE |         95% CI | t(20) |      p
## -----------------------------------------------------------------------------
## (Intercept)            |        1.34 | 0.20 | [ 0.92,  1.75] |  6.74 | < .001
## Floral simpson index T |        0.26 | 0.06 | [ 0.14,  0.39] |  4.41 | < .001
## rec time min scaled    |       -0.14 | 0.05 | [-0.24, -0.05] | -3.09 | 0.006 
## dm temperature         |       -0.32 | 0.19 | [-0.72,  0.09] | -1.63 | 0.119 
## Plot Cover T           |        0.33 | 0.07 | [ 0.19,  0.46] |  5.02 | < .001
## 
## # Random Effects
## 
## Parameter                | Coefficient |   SE |       95% CI
## ------------------------------------------------------------
## SD (Intercept: location) |        0.54 | 0.15 | [0.31, 0.94]
## SD (Residual)            |        0.19 | 0.04 | [0.13, 0.28]
## 
## Uncertainty intervals (equal-tailed) and p-values (two-tailed) computed
##   using a Wald t-distribution approximation. Uncertainty intervals for
##   random effect variances computed using a Wald z-distribution
##   approximation.
#check for singularity
performance::check_singularity(platform_shannon_mod5_gauss)
## [1] FALSE
#check the model
check_model(platform_shannon_mod5_gauss, verbose = T)
## Some of the variables were in matrix-format - probably you used
##   `scale()` on your data?
##   If so, and you get an error, please try `datawizard::standardize()` to
##   standardize your data.
## Some of the variables were in matrix-format - probably you used
##   `scale()` on your data?
##   If so, and you get an error, please try `datawizard::standardize()` to
##   standardize your data.

#overdispersion
check_overdispersion(platform_shannon_mod5_gauss)
## # Overdispersion test
## 
##  dispersion ratio = 0.869
##           p-value = 0.856
## No overdispersion detected.
#collinearity
check_collinearity(platform_shannon_mod5_gauss)
## # Check for Multicollinearity
## 
## Low Correlation
## 
##                    Term  VIF       VIF 95% CI Increased SE Tolerance
##  Floral_simpson_index_T 1.61 [1.22,     2.69]         1.27      0.62
##     rec_time_min_scaled 1.06 [1.00,    11.89]         1.03      0.94
##          dm_temperature 1.01 [1.00, 7.82e+06]         1.01      0.99
##            Plot_Cover_T 1.68 [1.26,     2.78]         1.30      0.60
##  Tolerance 95% CI
##      [0.37, 0.82]
##      [0.08, 1.00]
##      [0.00, 1.00]
##      [0.36, 0.80]
# DHARMa package - simulate residuals and check model assumptions
platform_shannon_mod5_gauss_sim_res <- simulateResiduals(fittedModel = platform_shannon_mod5_gauss)
#plot(platform_shannon_mod5_gauss_sim_res)
testUniformity(platform_shannon_mod5_gauss_sim_res) 

## 
##  Asymptotic one-sample Kolmogorov-Smirnov test
## 
## data:  simulationOutput$scaledResiduals
## D = 0.083556, p-value = 0.9917
## alternative hypothesis: two-sided
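
The output above is a one-sample Kolmogorov-Smirnov test on the scaled residuals, which should be roughly uniform on [0, 1] when the model is well specified. The base-R sketch below runs the same kind of test on simulated values (not the model's residuals) to show how a uniform and a clearly skewed sample behave. The sample size of 27 mirrors the number of observations, and the distributions are arbitrary choices for illustration.

```r
set.seed(42)
n <- 27

u_uniform <- runif(n)        # what well-behaved scaled residuals look like
u_skewed  <- rbeta(n, 5, 1)  # residuals piled up near 1 (a misfit pattern)

ks_uniform <- ks.test(u_uniform, "punif")
ks_skewed  <- ks.test(u_skewed, "punif")

# D is the largest gap between the empirical CDF and the uniform CDF
c(D_uniform = unname(ks_uniform$statistic),
  D_skewed  = unname(ks_skewed$statistic))
c(p_uniform = ks_uniform$p.value,
  p_skewed  = ks_skewed$p.value)
```

A small D with a large p-value, as reported for platform_shannon_mod5_gauss, gives no evidence against uniformity. With n = 27 the test has limited power, so it does not prove the residuals are well behaved.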
#remove dm temperature (p= 0.119    for platform_shannon_mod5_gauss)
platform_shannon_mod6_gauss <- lmer(shannon_diversity 
                                   ~ Floral_simpson_index_T 
                                   + rec_time_min_scaled
                                   #+ top2_ratio
                                   #+ Site_type
                                   #+ dm_wind_velocity
                                   #+ Days_since_start
                                   #+ dm_temperature
                                   + Plot_Cover_T
                                   + (1 | location), 
                                   data = platform_diversity)
summary(platform_shannon_mod6_gauss)
## Linear mixed model fit by REML ['lmerMod']
## Formula: shannon_diversity ~ Floral_simpson_index_T + rec_time_min_scaled +  
##     Plot_Cover_T + (1 | location)
##    Data: platform_diversity
## 
## REML criterion at convergence: 28
## 
## Scaled residuals: 
##     Min      1Q  Median      3Q     Max 
## -1.1782 -0.6161 -0.1718  0.5539  1.6150 
## 
## Random effects:
##  Groups   Name        Variance Std.Dev.
##  location (Intercept) 0.35804  0.5984  
##  Residual             0.03731  0.1932  
## Number of obs: 27, groups:  location, 9
## 
## Fixed effects:
##                        Estimate Std. Error t value
## (Intercept)             1.23243    0.20517   6.007
## Floral_simpson_index_T  0.26685    0.06015   4.436
## rec_time_min_scaled    -0.13656    0.04557  -2.997
## Plot_Cover_T            0.32541    0.06553   4.966
## 
## Correlation of Fixed Effects:
##             (Intr) Fl___T rc_t__
## Flrl_smp__T  0.149              
## rc_tm_mn_sc -0.002 -0.017       
## Plot_Covr_T  0.090  0.608 -0.189
parameters(platform_shannon_mod6_gauss)
## # Fixed Effects
## 
## Parameter              | Coefficient |   SE |         95% CI | t(21) |      p
## -----------------------------------------------------------------------------
## (Intercept)            |        1.23 | 0.21 | [ 0.81,  1.66] |  6.01 | < .001
## Floral simpson index T |        0.27 | 0.06 | [ 0.14,  0.39] |  4.44 | < .001
## rec time min scaled    |       -0.14 | 0.05 | [-0.23, -0.04] | -3.00 | 0.007 
## Plot Cover T           |        0.33 | 0.07 | [ 0.19,  0.46] |  4.97 | < .001
## 
## # Random Effects
## 
## Parameter                | Coefficient |   SE |       95% CI
## ------------------------------------------------------------
## SD (Intercept: location) |        0.60 | 0.16 | [0.36, 1.00]
## SD (Residual)            |        0.19 | 0.04 | [0.14, 0.28]
## 
## Uncertainty intervals (equal-tailed) and p-values (two-tailed) computed
##   using a Wald t-distribution approximation. Uncertainty intervals for
##   random effect variances computed using a Wald z-distribution
##   approximation.
#check for singularity
performance::check_singularity(platform_shannon_mod6_gauss)
## [1] FALSE
#check the model
check_model(platform_shannon_mod6_gauss, verbose = T)
## Some of the variables were in matrix-format - probably you used
##   `scale()` on your data?
##   If so, and you get an error, please try `datawizard::standardize()` to
##   standardize your data.
## Some of the variables were in matrix-format - probably you used
##   `scale()` on your data?
##   If so, and you get an error, please try `datawizard::standardize()` to
##   standardize your data.

#overdispersion
check_overdispersion(platform_shannon_mod6_gauss)
## # Overdispersion test
## 
##  dispersion ratio = 0.987
##           p-value = 0.864
## No overdispersion detected.
#collinearity
check_collinearity(platform_shannon_mod6_gauss)
## # Check for Multicollinearity
## 
## Low Correlation
## 
##                    Term  VIF    VIF 95% CI Increased SE Tolerance
##  Floral_simpson_index_T 1.61 [1.21,  2.77]         1.27      0.62
##     rec_time_min_scaled 1.05 [1.00, 23.37]         1.03      0.95
##            Plot_Cover_T 1.67 [1.24,  2.86]         1.29      0.60
##  Tolerance 95% CI
##      [0.36, 0.82]
##      [0.04, 1.00]
##      [0.35, 0.80]
# DHARMa package - simulate residuals and check model assumptions
platform_shannon_mod6_gauss_sim_res <- simulateResiduals(fittedModel = platform_shannon_mod6_gauss)
plot(platform_shannon_mod6_gauss_sim_res)
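
Each summary() above reports a location variance and a residual variance. Their ratio gives the share of unexplained variance attributable to differences between locations, which is what the ICC column in the model comparison reports. A minimal sketch of that arithmetic, using variances copied from the mod1 and mod6 output (vector names are illustrative):

```r
# Variance components copied from the "Random effects" blocks above
var_location <- c(mod1 = 0.1863, mod6 = 0.35804)
var_residual <- c(mod1 = 0.0399, mod6 = 0.03731)

# Adjusted ICC: between-location variance over total random variance
icc <- var_location / (var_location + var_residual)
round(icc, 3)
```

These evaluate to 0.824 and 0.906. As fixed effects are removed, the location variance grows, so more of the variation is attributed to the random intercept.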

IV.C.3.a. Compare the models with the performance package

# Compare the models with the performance package
platform_shannon_gauss_comp1 <- compare_performance(platform_shannon_mod1_gauss, platform_shannon_mod2_gauss,
                                                    platform_shannon_mod3_gauss, platform_shannon_mod4_gauss,
                                                    platform_shannon_mod5_gauss, platform_shannon_mod6_gauss,
                                                    metrics = c("AICc", "BIC", "R2", "ICC", "RMSE"))
# Print the comparison table
print(platform_shannon_gauss_comp1)
## # Comparison of Model Performance Indices
## 
## Name                        |   Model | AICc (weights) | BIC (weights)
## ----------------------------------------------------------------------
## platform_shannon_mod6_gauss | lmerMod |   30.3 (0.400) |  33.8 (0.258)
## platform_shannon_mod5_gauss | lmerMod |   31.0 (0.273) |  34.2 (0.214)
## platform_shannon_mod1_gauss | lmerMod |   42.7 (<.001) |  39.3 (0.016)
## platform_shannon_mod2_gauss | lmerMod |   37.3 (0.012) |  36.5 (0.068)
## platform_shannon_mod3_gauss | lmerMod |   33.8 (0.067) |  34.9 (0.152)
## platform_shannon_mod4_gauss | lmerMod |   31.2 (0.247) |  33.6 (0.292)
## 
## Name                        | R2 (cond.) | R2 (marg.) |   ICC |  RMSE
## ---------------------------------------------------------------------
## platform_shannon_mod6_gauss |      0.922 |      0.170 | 0.906 | 0.146
## platform_shannon_mod5_gauss |      0.921 |      0.294 | 0.889 | 0.145
## platform_shannon_mod1_gauss |      0.913 |      0.507 | 0.824 | 0.146
## platform_shannon_mod2_gauss |      0.920 |      0.502 | 0.840 | 0.145
## platform_shannon_mod3_gauss |      0.921 |      0.490 | 0.844 | 0.145
## platform_shannon_mod4_gauss |      0.924 |      0.486 | 0.852 | 0.146
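
AICc adds a small-sample penalty to AIC, 2k(k + 1) / (n - k - 1), where k is the number of estimated parameters and n the number of observations. With n = 27 this penalty grows quickly with k, which is likely one reason the full model has the highest AICc in the table. The sketch below assumes k counts the fixed-effect coefficients plus two variance parameters (random intercept and residual). How compare_performance() counts parameters internally is not verified here.

```r
# Small-sample correction term that AICc adds on top of AIC
aicc_penalty <- function(k, n) 2 * k * (k + 1) / (n - k - 1)

n_obs <- 27
# Assumed parameter counts: fixed-effect coefficients + 2 variance parameters
k <- c(mod1 = 9 + 2, mod6 = 4 + 2)

penalty <- aicc_penalty(k, n_obs)
round(penalty, 1)
```

Under these assumptions the penalty is 17.6 for mod1 and 4.2 for mod6, a gap of about 13 units that comes from model size alone.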

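Of the six models, only the full model platform_shannon_mod1_gauss passes the DHARMa residuals vs. predicted check. The reduced models show a clear pattern in the residuals, indicating that they may fit the data poorly. Note, however, that platform_shannon_mod6_gauss has the lowest AICc (30.3, weight 0.400) in the comparison table and is the model visualized below.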

IV.C.3.b. Visualize the model results

#plot_model(platform_shannon_mod1_gauss , type = "est", show.values = TRUE, value.offset = .3)
#plot_model(platform_shannon_mod2_gauss , type = "est", show.values = TRUE, value.offset = .3)
#plot_model(platform_shannon_mod3_gauss , type = "est", show.values = TRUE, value.offset = .3)
#plot_model(platform_shannon_mod4_gauss , type = "est", show.values = TRUE, value.offset = .3)
#plot_model(platform_shannon_mod5_gauss , type = "est", show.values = TRUE, value.offset = .3)
plot_model(platform_shannon_mod6_gauss , type = "est", show.values = TRUE, value.offset = .3)

(estplatshan <- plot_model(platform_shannon_mod6_gauss, 
           type = "est", 
           show.values = TRUE, 
           value.offset = 0.3,
           #sort.est = TRUE,
           axis.labels = c("Flower Cover % per Transect",
                           "Recording time (min)",
                           "Floral Simpson Index"
                           )) +
    labs(title = "Platform Camera: Shannon Diversity of Pollinators", x = "Predictors",y = "Estimate") + 
    theme(axis.text.y = element_text(hjust = 0)))  # 0 = left, 1 = right

##PLOT COVER platform_shannon_mod6_gauss---------
# Get the original mean and SD of plot cover before scaling
plot_cover_mean <- mean(envir_data$Plot_Cover_T, na.rm = TRUE)
plot_cover_sd <- sd(envir_data$Plot_Cover_T, na.rm = TRUE)
# Get predictions on the scaled variable
#pred_plot_cover <- ggpredict(platform_shannon_mod6_gauss , terms = "Plot_Cover_T")

# Get observed scaled range
range_scaled <- range(scale(envir_data$Plot_Cover_T), na.rm = TRUE)

# Generate predictions only within that range
pred_plot_cover <- ggpredict(platform_shannon_mod6_gauss , 
                             terms = paste0("Plot_Cover_T [", round(range_scaled[1], 2), ":", round(range_scaled[2], 2), "]"))

# Unscale the x-axis
pred_plot_cover$x_unscaled <- (pred_plot_cover$x * plot_cover_sd) + plot_cover_mean
# Plot
ggplot(pred_plot_cover, aes(x = x_unscaled, y = predicted)) +
  #plot with predictor color
  geom_line(size = 1.2, color = predictor_colors[["Plot_Cover_T"]]) +
  geom_ribbon(aes(ymin = conf.low, ymax = conf.high), 
              fill = alpha(predictor_colors[["Plot_Cover_T"]], 0.5)) +
  labs(
    title = "Platform: Predicted Insect Shannon Diversity vs Floral Cover %",
    x = "Floral Cover: average % of flower cover per transect",
    y = "Predicted Pollinator Shannon Diversity on platform cameras"
  )+
  scale_y_continuous(limits = c(0, 1.5))  # Set y-axis from 0 to 1.5
## Warning: Removed 2 rows containing missing values or values outside the scale range
## (`geom_line()`).
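
The scaling and back-transformation used for the x-axis above can be checked in isolation. The following is a minimal sketch in base R, assuming the predictor was z-scored with scale(); the vector cover_raw is hypothetical and is not the study data.

```r
# Hypothetical raw predictor values (percent flower cover); not the study data
cover_raw <- c(2, 5, 8, 12, 20, 35, NA)

# z-score scaling as done by scale(): (x - mean) / sd
cover_mean   <- mean(cover_raw, na.rm = TRUE)
cover_sd     <- sd(cover_raw, na.rm = TRUE)
cover_scaled <- as.numeric(scale(cover_raw))

# Observed range on the scaled axis, used to restrict the predictions
range_scaled <- range(cover_scaled, na.rm = TRUE)
terms_string <- paste0("Plot_Cover_T [", round(range_scaled[1], 2), ":",
                       round(range_scaled[2], 2), "]")

# Back-transform: x = z * sd + mean recovers the original units
cover_back <- cover_scaled * cover_sd + cover_mean

print(terms_string)
print(all.equal(cover_back, cover_raw))
```

The mean and SD used for back-transformation must come from the same data that were scaled before model fitting; otherwise the axis is shifted or stretched.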

## floral simpson index platform_shannon_mod6_gauss---------
# Get the original mean and SD of floral simpson index before scaling
floral_simpson_mean <- mean(envir_data$Floral_simpson_index_T, na.rm = TRUE)
floral_simpson_sd <- sd(envir_data$Floral_simpson_index_T, na.rm = TRUE)
# Get predictions on the scaled variable
pred_floral_simpson <- ggpredict(platform_shannon_mod6_gauss , terms = "Floral_simpson_index_T")
# Unscale the x-axis
pred_floral_simpson$x_unscaled <- (pred_floral_simpson$x * floral_simpson_sd) + floral_simpson_mean
# Plot
(florplatshan <-ggplot(pred_floral_simpson, aes(x = x_unscaled, y = predicted)) +
  #plot with predictor color
  geom_line(size = 1.2, color = predictor_colors[["Floral_simpson_index_T"]]) +
  geom_ribbon(aes(ymin = conf.low, ymax = conf.high), 
              fill = alpha(predictor_colors[["Floral_simpson_index_T"]], 0.5)) +
  labs(
    title = "Platform: Predicted Insect Shannon Diversity vs Floral Simpson Index",
    x = "Floral Simpson Index",
    y = "Predicted Pollinator Shannon Diversity \non platform cameras"
  ))

## rec time min platform_shannon_mod6_gauss---------
# Get the original mean and SD of rec time min before scaling
rec_time_min_mean <- mean(platform_diversity$rec_time_min, na.rm = TRUE)
rec_time_min_sd <- sd(platform_diversity$rec_time_min, na.rm = TRUE)
# Get predictions on the scaled variable
pred_rec_time_min <- ggpredict(platform_shannon_mod6_gauss , terms = "rec_time_min_scaled")
# Unscale the x-axis
pred_rec_time_min$x_unscaled <- (pred_rec_time_min$x * rec_time_min_sd) + rec_time_min_mean
# Plot
ggplot(pred_rec_time_min, aes(x = x_unscaled, y = predicted)) +
  #plot with predictor color
  geom_line(size = 1.2, color = predictor_colors[["rec_time_min"]]) +
  geom_ribbon(aes(ymin = conf.low, ymax = conf.high), 
              fill = alpha(predictor_colors[["rec_time_min"]], 0.5)) +
  labs(
    title = "Platform: Predicted Insect Shannon Diversity vs Recording time",
    x = "Recording time (minutes)",
    y = "Predicted Pollinator Shannon Diversity on platform cameras"
  )

#combine plots florplatshan and estplatshan
library(cowplot)

florplatshan1 <- florplatshan + 
  #change title
  labs(title = "Predicted Pollinator Shannon Diversity vs Floral Simpson Index")+
  #title font to 14
  theme(plot.title = element_text(size = 14))
estplatshan1 <- estplatshan + 
  #change title
  labs(title = "Pollinator Shannon Diversity") +
  theme(plot.title = element_text(size = 14))

# Step 1: Combine the plots and label ONLY them (A and B)
combined_plots <- cowplot::plot_grid(
  estplatshan1, florplatshan1,
  ncol = 2,
  labels = c("A", "B"),   # Label just these two plots
  label_size = 14, 
  rel_widths = c(1, 1.1) # Right-hand plot slightly wider than the left
)

# Step 2: Add the title separately, so it doesn't get a label
(final_plot <- cowplot::plot_grid(
  ggdraw() + draw_label(
    "Platform Cameras: Pollinator Shannon Diversity", 
    fontface = 'bold', size = 14, x = 0.5, hjust = 0.5
  ),
  combined_plots,
  ncol = 1,
  rel_heights = c(0.1, 1)  # Title height vs. plots height
))

#save the combined plot (passed explicitly instead of relying on last_plot())
ggsave("C:/Users/Almas/Desktop/UNI_LEIPSI/Thesis/Thesis_Rproject/figures/platform_shannon_diversity.png", 
       plot = final_plot, width = 12, height = 6, dpi = 600)



#rm(florplatshan,estplatshan , florplatshan1, estplatshan1)

IV.C.3.c. Interpretation of the model results

#remove all objects starting with platform_shannon_
rm(list = ls(pattern = "^platform_shannon_"))

IV.C.4. PLATFORM CAMERAS Simpson - beta regression

#histogram of simpson index
platform_diversity %>%
  ggplot(aes(x = simpson_diversity)) +
  geom_histogram(binwidth = 0.1, fill = "lightblue", color = "black") +
  labs(title = "Histogram of Insect Simpson Diversity",
       x = "Insect Simpson Diversity",
       y = "Count")

#testing the normality of the simpson index
shapiro.test(platform_diversity$simpson_diversity) # p-value = 0.001222, simpson index is not normally distributed
## 
##  Shapiro-Wilk normality test
## 
## data:  platform_diversity$simpson_diversity
## W = 0.8513, p-value = 0.001222
datawizard::describe_distribution(platform_diversity$simpson_diversity)
## Mean |   SD |  IQR |        Range | Skewness | Kurtosis |  n | n_Missing
## ------------------------------------------------------------------------
## 0.55 | 0.25 | 0.36 | [0.00, 0.85] |    -1.13 |     0.35 | 27 |         0
#adding a small constant (0.0001) to all simpson index values so that zeros fall inside (0, 1), as required by the beta family
platform_diversity <- platform_diversity %>%
  mutate(simpson_diversity = simpson_diversity + 0.0001)

#histogram of rec_time_min
platform_diversity %>%
  ggplot(aes(x = rec_time_min)) +
  geom_histogram(fill = "lightblue", color = "black") +
  labs(title = "Histogram of Recording Time",
       x = "Recording Time (minutes)",
       y = "Count")
## `stat_bin()` using `bins = 30`. Pick better value with `binwidth`.

#testing the normality of the rec_time_min
shapiro.test(platform_diversity$rec_time_min) # p-value = 6.719e-08, rec_time_min is not normally distributed
## 
##  Shapiro-Wilk normality test
## 
## data:  platform_diversity$rec_time_min
## W = 0.55695, p-value = 6.719e-08
datawizard::describe_distribution(platform_diversity$rec_time_min)
##   Mean |    SD |   IQR |           Range | Skewness | Kurtosis |  n | n_Missing
## -------------------------------------------------------------------------------
## 324.90 | 80.86 | 72.72 | [66.21, 394.75] |    -2.59 |     6.47 | 27 |         0
summary(platform_diversity$rec_time_min)
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##   66.21  323.70  360.06  324.90  360.07  394.75

The Simpson diversity captured with the platform cameras is not normally distributed (Shapiro-Wilk test: W = 0.851, p-value = 0.001222) and has a skewness of -1.13, indicating a left skew. The kurtosis is 0.35, i.e. slightly peaked. Since the values lie between 0 and 1, we use a beta distribution. However, the beta distribution is defined only on the open interval (0, 1) and cannot handle exact 0 or 1 values, and the observed range [0.00, 0.85] includes 0. We therefore add a small constant (0.0001) to all values.
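
The boundary adjustment can be illustrated with a minimal base R sketch. The vector simpson is hypothetical, not the study data. Option 2 is an assumption added for comparison and is not used in this analysis: it is the transformation proposed by Smithson & Verkuilen (2006), which also handles exact 1 values.

```r
# Hypothetical Simpson values including an exact zero; not the study data
simpson <- c(0, 0.21, 0.48, 0.63, 0.85)

# Option 1 (as in the analysis): shift all values by a small constant
simpson_shifted <- simpson + 0.0001

# Option 2 (assumption, not used in the analysis): squeeze towards 0.5,
# which pulls both 0 and 1 into the open interval (0, 1)
n <- length(simpson)
simpson_squeezed <- (simpson * (n - 1) + 0.5) / n

# Helper: are all values strictly inside (0, 1)?
in_open_unit <- function(x) all(x > 0 & x < 1)

print(in_open_unit(simpson))
print(in_open_unit(simpson_shifted))
print(in_open_unit(simpson_squeezed))
```

A constant shift only works while no value is so close to 1 that the shift pushes it onto or past the upper boundary; with a maximum of 0.85 that is not an issue here.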

“Model fitting was performed using the glmmTMB package (Brooks et al., 2017), with the BFGS optimization method (optim(method = "BFGS")) specified to improve convergence. BFGS is a quasi-Newton optimization algorithm that uses gradient information and an approximated Hessian to locate the maximum likelihood estimates more efficiently, particularly in non-linear optimization problems such as beta regression. This method was chosen after initial fitting attempts using the default optimizer resulted in convergence warnings and NaN function evaluations. Using BFGS yielded stable estimates and improved model diagnostics, without overfitting or inflated standard errors.”
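
To make the role of BFGS concrete, here is a minimal sketch that fits an intercept-only beta model by maximum likelihood with optim(method = "BFGS") in base R. It is an illustration under stated assumptions, not how glmmTMB fits the mixed models in this section: the data are simulated, there are no random effects, and the names mu_true, phi_true and negloglik are hypothetical.

```r
set.seed(1)

# Simulated proportions from a beta distribution (hypothetical data)
mu_true  <- 0.6   # mean
phi_true <- 8     # precision (dispersion parameter)
y <- rbeta(500, mu_true * phi_true, (1 - mu_true) * phi_true)

# Negative log-likelihood; par = c(logit(mu), log(phi)) so that both
# parameters are unconstrained on the optimisation scale
negloglik <- function(par, y) {
  mu  <- plogis(par[1])
  phi <- exp(par[2])
  -sum(dbeta(y, mu * phi, (1 - mu) * phi, log = TRUE))
}

# Quasi-Newton optimisation, started at the logit of the sample mean
fit <- optim(par = c(qlogis(mean(y)), log(5)), fn = negloglik, y = y,
             method = "BFGS")

# Back-transform the estimates to the natural scale
mu_hat  <- plogis(fit$par[1])
phi_hat <- exp(fit$par[2])
print(c(mu_hat = mu_hat, phi_hat = phi_hat, convergence = fit$convergence))
```

The logit and log parameterisation mirrors the logit link and the positive dispersion parameter reported in the summaries of the beta models in this section.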

# full model: insect Simpson diversity as response; environmental, weather and plant diversity variables as explanatory variables; location as random intercept; recording time included to account for differences in sampling effort

platform_simpson_mod1_beta <- glmmTMB(simpson_diversity 
                                       ~ Floral_simpson_index_T 
                                       + rec_time_min
                                       + top2_ratio
                                       + Site_type
                                       + Days_since_start
                                       + dm_wind_velocity 
                                       + dm_temperature 
                                       + Plot_Cover_T
                                       + (1 | location), 
                                     family = beta_family(link = "logit"), 
                                     control = glmmTMBControl(optimizer = optim, optArgs = list(method = "BFGS")),
                                     data = platform_diversity)
summary(platform_simpson_mod1_beta)
##  Family: beta  ( logit )
## Formula:          
## simpson_diversity ~ Floral_simpson_index_T + rec_time_min + top2_ratio +  
##     Site_type + Days_since_start + dm_wind_velocity + dm_temperature +  
##     Plot_Cover_T + (1 | location)
## Data: platform_diversity
## 
##      AIC      BIC   logLik deviance df.resid 
##    -28.0    -13.8     25.0    -50.0       16 
## 
## Random effects:
## 
## Conditional model:
##  Groups   Name        Variance Std.Dev.
##  location (Intercept) 0.2309   0.4806  
## Number of obs: 27, groups:  location, 9
## 
## Dispersion parameter for beta family (): 8.77 
## 
## Conditional model:
##                          Estimate Std. Error z value Pr(>|z|)    
## (Intercept)              3.333325   0.899504   3.706 0.000211 ***
## Floral_simpson_index_T   0.631193   0.224664   2.809 0.004962 ** 
## rec_time_min            -0.008915   0.002385  -3.738 0.000186 ***
## top2_ratio               0.381772   0.183954   2.075 0.037952 *  
## Site_typeyoung_restored  1.616338   0.640539   2.523 0.011622 *  
## Days_since_start        -0.446590   0.163793  -2.727 0.006400 ** 
## dm_wind_velocity        -0.900100   0.346481  -2.598 0.009381 ** 
## dm_temperature          -1.496801   0.383002  -3.908  9.3e-05 ***
## Plot_Cover_T             0.737071   0.290527   2.537 0.011180 *  
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
parameters(platform_simpson_mod1_beta)
## # Fixed Effects
## 
## Parameter                  | Coefficient |       SE |         95% CI |     z |      p
## -------------------------------------------------------------------------------------
## (Intercept)                |        3.33 |     0.90 | [ 1.57,  5.10] |  3.71 | < .001
## Floral simpson index T     |        0.63 |     0.22 | [ 0.19,  1.07] |  2.81 | 0.005 
## rec time min               |   -8.91e-03 | 2.39e-03 | [-0.01,  0.00] | -3.74 | < .001
## top2 ratio                 |        0.38 |     0.18 | [ 0.02,  0.74] |  2.08 | 0.038 
## Site type [young_restored] |        1.62 |     0.64 | [ 0.36,  2.87] |  2.52 | 0.012 
## Days since start           |       -0.45 |     0.16 | [-0.77, -0.13] | -2.73 | 0.006 
## dm wind velocity           |       -0.90 |     0.35 | [-1.58, -0.22] | -2.60 | 0.009 
## dm temperature             |       -1.50 |     0.38 | [-2.25, -0.75] | -3.91 | < .001
## Plot Cover T               |        0.74 |     0.29 | [ 0.17,  1.31] |  2.54 | 0.011 
## 
## # Dispersion
## 
## Parameter   | Coefficient |        95% CI
## -----------------------------------------
## (Intercept) |        8.77 | [4.33, 17.76]
## 
## # Random Effects Variances
## 
## Parameter                | Coefficient |       95% CI
## -----------------------------------------------------
## SD (Intercept: location) |        0.48 | [0.16, 1.40]
## 
## Uncertainty intervals (equal-tailed) and p-values (two-tailed) computed
##   using a Wald z-distribution approximation.
#check for singularity
performance::check_singularity(platform_simpson_mod1_beta)
## [1] FALSE
#check the model
check_model(platform_simpson_mod1_beta, verbose = T)
## `check_outliers()` does not yet support models of class `glmmTMB`.

#overdispersion
check_overdispersion(platform_simpson_mod1_beta)
## # Overdispersion test
## 
##  dispersion ratio = 1.038
##           p-value = 0.848
## No overdispersion detected.
#collinearity
check_collinearity(platform_simpson_mod1_beta)
## # Check for Multicollinearity
## 
## Low Correlation
## 
##                    Term  VIF   VIF 95% CI Increased SE Tolerance
##  Floral_simpson_index_T 1.96 [1.51, 2.83]         1.40      0.51
##            rec_time_min 1.37 [1.14, 2.01]         1.17      0.73
##              top2_ratio 1.44 [1.18, 2.09]         1.20      0.70
##               Site_type 2.22 [1.67, 3.22]         1.49      0.45
##        Days_since_start 1.47 [1.20, 2.13]         1.21      0.68
##        dm_wind_velocity 2.31 [1.73, 3.35]         1.52      0.43
##          dm_temperature 3.03 [2.20, 4.45]         1.74      0.33
##            Plot_Cover_T 3.15 [2.27, 4.63]         1.77      0.32
##  Tolerance 95% CI
##      [0.35, 0.66]
##      [0.50, 0.88]
##      [0.48, 0.85]
##      [0.31, 0.60]
##      [0.47, 0.84]
##      [0.30, 0.58]
##      [0.22, 0.45]
##      [0.22, 0.44]
# DHARMa package - simulate residuals and check model assumptions
platform_simpson_mod1_beta_sim_res <- simulateResiduals(fittedModel = platform_simpson_mod1_beta)
plot(platform_simpson_mod1_beta_sim_res)
## qu = 0.75, log(sigma) = -3.178449 : outer Newton did not converge fully.
## qu = 0.75, log(sigma) = -3.201513 : outer Newton did not converge fully.
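
Because the model uses a logit link, the coefficients are on the log-odds scale. The sketch below back-transforms the fixed-effect prediction for recording time using the estimates printed in the summary of platform_simpson_mod1_beta. It assumes, for illustration only, that all other predictors are held at 0 (reference site type; 0 corresponds to the mean only if those predictors were z-scored) and it ignores the location random effect.

```r
# Coefficients copied from summary(platform_simpson_mod1_beta)
b_intercept <- 3.333325
b_rec_time  <- -0.008915

# Recording times in minutes; 324.90 is the reported mean
rec_time <- c(200, 324.90, 390)

# Linear predictor on the logit scale (other predictors assumed to be 0)
eta <- b_intercept + b_rec_time * rec_time

# Inverse logit gives the expected Simpson index on the (0, 1) scale
mu <- plogis(eta)

print(round(data.frame(rec_time, eta, mu), 3))
```

A negative coefficient therefore translates into a lower expected Simpson index at longer recording times, with the size of the change depending on where on the logit curve the prediction sits.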

#remove top2 ratio (largest p-value, p = 0.038, in platform_simpson_mod1_beta)
platform_simpson_mod2_beta <- glmmTMB(simpson_diversity 
                                       ~ Floral_simpson_index_T 
                                       # top2_ratio
                                       + rec_time_min
                                       + Site_type
                                       + Days_since_start
                                       + dm_wind_velocity 
                                       + dm_temperature 
                                       + Plot_Cover_T
                                       + (1 | location), 
                                     family = beta_family(link = "logit"), 
                                     control = glmmTMBControl(optimizer = optim, optArgs = list(method = "BFGS")),
                                     data = platform_diversity)
summary(platform_simpson_mod2_beta)
##  Family: beta  ( logit )
## Formula:          
## simpson_diversity ~ Floral_simpson_index_T + rec_time_min + Site_type +  
##     Days_since_start + dm_wind_velocity + dm_temperature + Plot_Cover_T +  
##     (1 | location)
## Data: platform_diversity
## 
##      AIC      BIC   logLik deviance df.resid 
##    -25.6    -12.7     22.8    -45.6       17 
## 
## Random effects:
## 
## Conditional model:
##  Groups   Name        Variance Std.Dev.
##  location (Intercept) 0.5128   0.7161  
## Number of obs: 27, groups:  location, 9
## 
## Dispersion parameter for beta family (): 8.87 
## 
## Conditional model:
##                          Estimate Std. Error z value Pr(>|z|)    
## (Intercept)              3.819051   0.834995   4.574 4.79e-06 ***
## Floral_simpson_index_T   0.698524   0.231153   3.022 0.002512 ** 
## rec_time_min            -0.009970   0.002117  -4.709 2.49e-06 ***
## Site_typeyoung_restored  1.632280   0.758180   2.153 0.031327 *  
## Days_since_start        -0.456857   0.206812  -2.209 0.027171 *  
## dm_wind_velocity        -0.923361   0.441320  -2.092 0.036414 *  
## dm_temperature          -1.638459   0.453394  -3.614 0.000302 ***
## Plot_Cover_T             0.995462   0.278243   3.578 0.000347 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
parameters(platform_simpson_mod2_beta)
## # Fixed Effects
## 
## Parameter                  | Coefficient |       SE |         95% CI |     z |      p
## -------------------------------------------------------------------------------------
## (Intercept)                |        3.82 |     0.83 | [ 2.18,  5.46] |  4.57 | < .001
## Floral simpson index T     |        0.70 |     0.23 | [ 0.25,  1.15] |  3.02 | 0.003 
## rec time min               |   -9.97e-03 | 2.12e-03 | [-0.01, -0.01] | -4.71 | < .001
## Site type [young_restored] |        1.63 |     0.76 | [ 0.15,  3.12] |  2.15 | 0.031 
## Days since start           |       -0.46 |     0.21 | [-0.86, -0.05] | -2.21 | 0.027 
## dm wind velocity           |       -0.92 |     0.44 | [-1.79, -0.06] | -2.09 | 0.036 
## dm temperature             |       -1.64 |     0.45 | [-2.53, -0.75] | -3.61 | < .001
## Plot Cover T               |        1.00 |     0.28 | [ 0.45,  1.54] |  3.58 | < .001
## 
## # Dispersion
## 
## Parameter   | Coefficient |        95% CI
## -----------------------------------------
## (Intercept) |        8.87 | [4.68, 16.83]
## 
## # Random Effects Variances
## 
## Parameter                | Coefficient |       95% CI
## -----------------------------------------------------
## SD (Intercept: location) |        0.72 | [0.38, 1.37]
## 
## Uncertainty intervals (equal-tailed) and p-values (two-tailed) computed
##   using a Wald z-distribution approximation.
#check for singularity
performance::check_singularity(platform_simpson_mod2_beta)
## [1] FALSE
#check the model
check_model(platform_simpson_mod2_beta, verbose = T)
## `check_outliers()` does not yet support models of class `glmmTMB`.

#overdispersion
check_overdispersion(platform_simpson_mod2_beta)
## # Overdispersion test
## 
##  dispersion ratio = 1.184
##           p-value = 0.472
## No overdispersion detected.
#collinearity
check_collinearity(platform_simpson_mod2_beta)
## # Check for Multicollinearity
## 
## Low Correlation
## 
##                    Term  VIF   VIF 95% CI Increased SE Tolerance
##  Floral_simpson_index_T 1.77 [1.36, 2.61]         1.33      0.57
##            rec_time_min 1.16 [1.02, 2.03]         1.08      0.86
##               Site_type 1.85 [1.42, 2.74]         1.36      0.54
##        Days_since_start 1.44 [1.17, 2.15]         1.20      0.69
##        dm_wind_velocity 2.22 [1.65, 3.31]         1.49      0.45
##          dm_temperature 2.52 [1.84, 3.77]         1.59      0.40
##            Plot_Cover_T 2.10 [1.57, 3.12]         1.45      0.48
##  Tolerance 95% CI
##      [0.38, 0.73]
##      [0.49, 0.98]
##      [0.36, 0.71]
##      [0.46, 0.86]
##      [0.30, 0.61]
##      [0.27, 0.54]
##      [0.32, 0.64]
# DHARMa package - simulate residuals and check model assumptions
platform_simpson_mod2_beta_sim_res <- simulateResiduals(fittedModel = platform_simpson_mod2_beta)
plot(platform_simpson_mod2_beta_sim_res)
## qu = 0.75, log(sigma) = -3.537205 : outer Newton did not converge fully.

#remove wind velocity (largest p-value, p = 0.036, in platform_simpson_mod2_beta)
platform_simpson_mod3_beta <- glmmTMB(simpson_diversity 
                                       ~ Floral_simpson_index_T 
                                       # top2_ratio
                                       + Site_type
                                       + rec_time_min
                                       #+ dm_wind_velocity 
                                       + Days_since_start
                                       + dm_temperature 
                                       + Plot_Cover_T
                                       + (1 | location), 
                                     family = beta_family(link = "logit"), 
                                     control = glmmTMBControl(optimizer = optim, optArgs = list(method = "BFGS")),
                                     data = platform_diversity)

summary(platform_simpson_mod3_beta)
##  Family: beta  ( logit )
## Formula:          
## simpson_diversity ~ Floral_simpson_index_T + Site_type + rec_time_min +  
##     Days_since_start + dm_temperature + Plot_Cover_T + (1 | location)
## Data: platform_diversity
## 
##      AIC      BIC   logLik deviance df.resid 
##    -23.9    -12.2     21.0    -41.9       18 
## 
## Random effects:
## 
## Conditional model:
##  Groups   Name        Variance Std.Dev.
##  location (Intercept) 0.837    0.9149  
## Number of obs: 27, groups:  location, 9
## 
## Dispersion parameter for beta family ():  8.8 
## 
## Conditional model:
##                          Estimate Std. Error z value Pr(>|z|)    
## (Intercept)              3.592149   0.867561   4.141 3.47e-05 ***
## Floral_simpson_index_T   0.731698   0.232468   3.148 0.001647 ** 
## Site_typeyoung_restored  1.192663   0.848305   1.406 0.159743    
## rec_time_min            -0.009695   0.002148  -4.513 6.40e-06 ***
## Days_since_start        -0.655586   0.224899  -2.915 0.003557 ** 
## dm_temperature          -1.082989   0.416091  -2.603 0.009247 ** 
## Plot_Cover_T             0.937094   0.272923   3.434 0.000596 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
parameters(platform_simpson_mod3_beta)
## # Fixed Effects
## 
## Parameter                  | Coefficient |       SE |         95% CI |     z |      p
## -------------------------------------------------------------------------------------
## (Intercept)                |        3.59 |     0.87 | [ 1.89,  5.29] |  4.14 | < .001
## Floral simpson index T     |        0.73 |     0.23 | [ 0.28,  1.19] |  3.15 | 0.002 
## Site type [young_restored] |        1.19 |     0.85 | [-0.47,  2.86] |  1.41 | 0.160 
## rec time min               |   -9.69e-03 | 2.15e-03 | [-0.01, -0.01] | -4.51 | < .001
## Days since start           |       -0.66 |     0.22 | [-1.10, -0.21] | -2.92 | 0.004 
## dm temperature             |       -1.08 |     0.42 | [-1.90, -0.27] | -2.60 | 0.009 
## Plot Cover T               |        0.94 |     0.27 | [ 0.40,  1.47] |  3.43 | < .001
## 
## # Dispersion
## 
## Parameter   | Coefficient |        95% CI
## -----------------------------------------
## (Intercept) |        8.80 | [4.65, 16.65]
## 
## # Random Effects Variances
## 
## Parameter                | Coefficient |       95% CI
## -----------------------------------------------------
## SD (Intercept: location) |        0.91 | [0.51, 1.65]
## 
## Uncertainty intervals (equal-tailed) and p-values (two-tailed) computed
##   using a Wald z-distribution approximation.
#check for singularity
performance::check_singularity(platform_simpson_mod3_beta)
## [1] FALSE
#check the model
check_model(platform_simpson_mod3_beta, verbose = T)
## `check_outliers()` does not yet support models of class `glmmTMB`.

#overdispersion
check_overdispersion(platform_simpson_mod3_beta)
## # Overdispersion test
## 
##  dispersion ratio = 1.167
##           p-value = 0.536
## No overdispersion detected.
#collinearity
check_collinearity(platform_simpson_mod3_beta)
## # Check for Multicollinearity
## 
## Low Correlation
## 
##                    Term  VIF   VIF 95% CI Increased SE Tolerance
##  Floral_simpson_index_T 1.74 [1.33, 2.64]         1.32      0.58
##               Site_type 1.57 [1.23, 2.40]         1.25      0.64
##            rec_time_min 1.12 [1.01, 2.47]         1.06      0.90
##        Days_since_start 1.17 [1.03, 2.11]         1.08      0.85
##          dm_temperature 1.42 [1.15, 2.19]         1.19      0.71
##            Plot_Cover_T 1.93 [1.45, 2.94]         1.39      0.52
##  Tolerance 95% CI
##      [0.38, 0.75]
##      [0.42, 0.81]
##      [0.41, 0.99]
##      [0.47, 0.97]
##      [0.46, 0.87]
##      [0.34, 0.69]
# DHARMa package - simulate residuals and check model assumptions
platform_simpson_mod3_beta_sim_res <- simulateResiduals(fittedModel = platform_simpson_mod3_beta)
plot(platform_simpson_mod3_beta_sim_res)
## qu = 0.75, log(sigma) = -2.814391 : outer Newton did not converge fully.
## qu = 0.75, log(sigma) = -3.521389 : outer Newton did not converge fully.

#remove site type (p = 0.160 in platform_simpson_mod3_beta)
platform_simpson_mod4_beta <- glmmTMB(simpson_diversity 
                                       ~ Floral_simpson_index_T 
                                       # top2_ratio
                                       #+ Site_type
                                       + rec_time_min
                                       #+ dm_wind_velocity 
                                       + Days_since_start
                                       + dm_temperature 
                                       + Plot_Cover_T
                                       + (1 | location), 
                                     family = beta_family(link = "logit"), 
                                     control = glmmTMBControl(optimizer = optim, optArgs = list(method = "BFGS")),
                                     data = platform_diversity)
summary(platform_simpson_mod4_beta)
##  Family: beta  ( logit )
## Formula:          
## simpson_diversity ~ Floral_simpson_index_T + rec_time_min + Days_since_start +  
##     dm_temperature + Plot_Cover_T + (1 | location)
## Data: platform_diversity
## 
##      AIC      BIC   logLik deviance df.resid 
##    -24.0    -13.7     20.0    -40.0       19 
## 
## Random effects:
## 
## Conditional model:
##  Groups   Name        Variance Std.Dev.
##  location (Intercept) 1.025    1.012   
## Number of obs: 27, groups:  location, 9
## 
## Dispersion parameter for beta family (): 8.61 
## 
## Conditional model:
##                         Estimate Std. Error z value Pr(>|z|)    
## (Intercept)             3.993548   0.868141   4.600 4.22e-06 ***
## Floral_simpson_index_T  0.665318   0.231014   2.880  0.00398 ** 
## rec_time_min           -0.009854   0.002176  -4.529 5.94e-06 ***
## Days_since_start       -0.549336   0.228198  -2.407  0.01607 *  
## dm_temperature         -0.803849   0.390979  -2.056  0.03978 *  
## Plot_Cover_T            0.857566   0.280181   3.061  0.00221 ** 
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
parameters(platform_simpson_mod4_beta)
## # Fixed Effects
## 
## Parameter              | Coefficient |       SE |         95% CI |     z |      p
## ---------------------------------------------------------------------------------
## (Intercept)            |        3.99 |     0.87 | [ 2.29,  5.70] |  4.60 | < .001
## Floral simpson index T |        0.67 |     0.23 | [ 0.21,  1.12] |  2.88 | 0.004 
## rec time min           |   -9.85e-03 | 2.18e-03 | [-0.01, -0.01] | -4.53 | < .001
## Days since start       |       -0.55 |     0.23 | [-1.00, -0.10] | -2.41 | 0.016 
## dm temperature         |       -0.80 |     0.39 | [-1.57, -0.04] | -2.06 | 0.040 
## Plot Cover T           |        0.86 |     0.28 | [ 0.31,  1.41] |  3.06 | 0.002 
## 
## # Dispersion
## 
## Parameter   | Coefficient |        95% CI
## -----------------------------------------
## (Intercept) |        8.61 | [4.51, 16.42]
## 
## # Random Effects Variances
## 
## Parameter                | Coefficient |       95% CI
## -----------------------------------------------------
## SD (Intercept: location) |        1.01 | [0.56, 1.84]
## 
## Uncertainty intervals (equal-tailed) and p-values (two-tailed) computed
##   using a Wald z-distribution approximation.
#check for singularity
performance::check_singularity(platform_simpson_mod4_beta)
## [1] FALSE
#check the model
check_model(platform_simpson_mod4_beta, verbose = T)
## `check_outliers()` does not yet support models of class `glmmTMB`.

#overdispersion
check_overdispersion(platform_simpson_mod4_beta)
## # Overdispersion test
## 
##  dispersion ratio = 1.126
##           p-value = 0.632
## No overdispersion detected.
#collinearity
check_collinearity(platform_simpson_mod4_beta)
## # Check for Multicollinearity
## 
## Low Correlation
## 
##                    Term  VIF       VIF 95% CI Increased SE Tolerance
##  Floral_simpson_index_T 1.71 [1.30,     2.67]         1.31      0.59
##            rec_time_min 1.10 [1.01,     2.93]         1.05      0.91
##        Days_since_start 1.02 [1.00, 16571.82]         1.01      0.98
##          dm_temperature 1.06 [1.00,     8.66]         1.03      0.95
##            Plot_Cover_T 1.82 [1.37,     2.85]         1.35      0.55
##  Tolerance 95% CI
##      [0.37, 0.77]
##      [0.34, 0.99]
##      [0.00, 1.00]
##      [0.12, 1.00]
##      [0.35, 0.73]
# DHARMa package - simulate residuals and check model assumptions
platform_simpson_mod4_beta_sim_res <- simulateResiduals(fittedModel = platform_simpson_mod4_beta)
plot(platform_simpson_mod4_beta_sim_res)
## qu = 0.75, log(sigma) = -2.670042 : outer Newton did not converge fully.

#remove temperature (largest p-value, p = 0.040, in platform_simpson_mod4_beta)
platform_simpson_mod5_beta <- glmmTMB(simpson_diversity 
                                       ~ Floral_simpson_index_T 
                                       # top2_ratio
                                       #+ Site_type
                                       + rec_time_min
                                       #+ dm_wind_velocity 
                                       + Days_since_start
                                       #+ dm_temperature 
                                       + Plot_Cover_T
                                       + (1 | location), 
                                     family = beta_family(link = "logit"), 
                                     control = glmmTMBControl(optimizer = optim, optArgs = list(method = "BFGS")),
                                     data = platform_diversity)
summary(platform_simpson_mod5_beta)
##  Family: beta  ( logit )
## Formula:          
## simpson_diversity ~ Floral_simpson_index_T + rec_time_min + Days_since_start +  
##     Plot_Cover_T + (1 | location)
## Data: platform_diversity
## 
##      AIC      BIC   logLik deviance df.resid 
##    -22.4    -13.4     18.2    -36.4       20 
## 
## Random effects:
## 
## Conditional model:
##  Groups   Name        Variance Std.Dev.
##  location (Intercept) 1.622    1.274   
## Number of obs: 27, groups:  location, 9
## 
## Dispersion parameter for beta family (): 8.69 
## 
## Conditional model:
##                         Estimate Std. Error z value Pr(>|z|)    
## (Intercept)             3.695561   0.884239   4.179 2.92e-05 ***
## Floral_simpson_index_T  0.713430   0.230630   3.093  0.00198 ** 
## rec_time_min           -0.009716   0.002199  -4.417 9.99e-06 ***
## Days_since_start       -0.525967   0.277055  -1.898  0.05764 .  
## Plot_Cover_T            0.884628   0.275729   3.208  0.00134 ** 
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
parameters(platform_simpson_mod5_beta)
## # Fixed Effects
## 
## Parameter              | Coefficient |       SE |         95% CI |     z |      p
## ---------------------------------------------------------------------------------
## (Intercept)            |        3.70 |     0.88 | [ 1.96,  5.43] |  4.18 | < .001
## Floral simpson index T |        0.71 |     0.23 | [ 0.26,  1.17] |  3.09 | 0.002 
## rec time min           |   -9.72e-03 | 2.20e-03 | [-0.01, -0.01] | -4.42 | < .001
## Days since start       |       -0.53 |     0.28 | [-1.07,  0.02] | -1.90 | 0.058 
## Plot Cover T           |        0.88 |     0.28 | [ 0.34,  1.43] |  3.21 | 0.001 
## 
## # Dispersion
## 
## Parameter   | Coefficient |        95% CI
## -----------------------------------------
## (Intercept) |        8.69 | [4.57, 16.52]
## 
## # Random Effects Variances
## 
## Parameter                | Coefficient |       95% CI
## -----------------------------------------------------
## SD (Intercept: location) |        1.27 | [0.73, 2.23]
## 
## Uncertainty intervals (equal-tailed) and p-values (two-tailed) computed
##   using a Wald z-distribution approximation.
#check for singularity
performance::check_singularity(platform_simpson_mod5_beta)
## [1] FALSE
#check the model
check_model(platform_simpson_mod5_beta, verbose = T)
## `check_outliers()` does not yet support models of class `glmmTMB`.

#overdispersion
check_overdispersion(platform_simpson_mod5_beta)
## # Overdispersion test
## 
##  dispersion ratio = 1.151
##           p-value = 0.632
## No overdispersion detected.
#collinearity
check_collinearity(platform_simpson_mod5_beta)
## # Check for Multicollinearity
## 
## Low Correlation
## 
##                    Term  VIF       VIF 95% CI Increased SE Tolerance
##  Floral_simpson_index_T 1.65 [1.25,     2.66]         1.28      0.61
##            rec_time_min 1.09 [1.00,     3.87]         1.04      0.92
##        Days_since_start 1.01 [1.00, 1.48e+11]         1.00      0.99
##            Plot_Cover_T 1.76 [1.31,     2.82]         1.33      0.57
##  Tolerance 95% CI
##      [0.38, 0.80]
##      [0.26, 1.00]
##      [0.00, 1.00]
##      [0.35, 0.76]
# DHARMa package - simulate residuals and check model assumptions
platform_simpson_mod5_beta_sim_res <- simulateResiduals(fittedModel = platform_simpson_mod5_beta)
plot(platform_simpson_mod5_beta_sim_res)
## qu = 0.5, log(sigma) = -3.146502 : outer Newton did not converge fully.

#remove days since start
platform_simpson_mod6_beta <- glmmTMB(simpson_diversity 
                                       ~ Floral_simpson_index_T 
                                       # top2_ratio
                                       #+ Site_type
                                       + rec_time_min
                                       #+ dm_wind_velocity 
                                       #+ Days_since_start
                                       #+ dm_temperature 
                                       + Plot_Cover_T
                                       + (1 | location), 
                                     family = beta_family(link = "logit"), 
                                     control = glmmTMBControl(optimizer = optim, optArgs = list(method = "BFGS")),
                                     data = platform_diversity)
summary(platform_simpson_mod6_beta)
##  Family: beta  ( logit )
## Formula:          
## simpson_diversity ~ Floral_simpson_index_T + rec_time_min + Plot_Cover_T +  
##     (1 | location)
## Data: platform_diversity
## 
##      AIC      BIC   logLik deviance df.resid 
##    -21.4    -13.6     16.7    -33.4       21 
## 
## Random effects:
## 
## Conditional model:
##  Groups   Name        Variance Std.Dev.
##  location (Intercept) 2.393    1.547   
## Number of obs: 27, groups:  location, 9
## 
## Dispersion parameter for beta family ():  8.7 
## 
## Conditional model:
##                         Estimate Std. Error z value Pr(>|z|)    
## (Intercept)             3.597587   0.904518   3.977 6.97e-05 ***
## Floral_simpson_index_T  0.740144   0.232369   3.185 0.001447 ** 
## rec_time_min           -0.009834   0.002147  -4.581 4.62e-06 ***
## Plot_Cover_T            0.935617   0.269710   3.469 0.000522 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
parameters(platform_simpson_mod6_beta)
## # Fixed Effects
## 
## Parameter              | Coefficient |       SE |         95% CI |     z |      p
## ---------------------------------------------------------------------------------
## (Intercept)            |        3.60 |     0.90 | [ 1.82,  5.37] |  3.98 | < .001
## Floral simpson index T |        0.74 |     0.23 | [ 0.28,  1.20] |  3.19 | 0.001 
## rec time min           |   -9.83e-03 | 2.15e-03 | [-0.01, -0.01] | -4.58 | < .001
## Plot Cover T           |        0.94 |     0.27 | [ 0.41,  1.46] |  3.47 | < .001
## 
## # Dispersion
## 
## Parameter   | Coefficient |        95% CI
## -----------------------------------------
## (Intercept) |        8.70 | [4.59, 16.48]
## 
## # Random Effects Variances
## 
## Parameter                | Coefficient |       95% CI
## -----------------------------------------------------
## SD (Intercept: location) |        1.55 | [0.91, 2.63]
## 
## Uncertainty intervals (equal-tailed) and p-values (two-tailed) computed
##   using a Wald z-distribution approximation.
#check for singularity
performance::check_singularity(platform_simpson_mod6_beta)
## [1] FALSE
#check the model
check_model(platform_simpson_mod6_beta, verbose = T)
## `check_outliers()` does not yet support models of class `glmmTMB`.

#overdispersion
check_overdispersion(platform_simpson_mod6_beta)
## # Overdispersion test
## 
##  dispersion ratio = 1.034
##           p-value = 0.856
## No overdispersion detected.
#collinearity
check_collinearity(platform_simpson_mod6_beta)
## # Check for Multicollinearity
## 
## Low Correlation
## 
##                    Term  VIF   VIF 95% CI Increased SE Tolerance
##  Floral_simpson_index_T 1.63 [1.23, 2.71]         1.28      0.61
##            rec_time_min 1.08 [1.00, 6.13]         1.04      0.93
##            Plot_Cover_T 1.72 [1.28, 2.85]         1.31      0.58
##  Tolerance 95% CI
##      [0.37, 0.81]
##      [0.16, 1.00]
##      [0.35, 0.78]
# DHARMa package - simulate residuals and check model assumptions
platform_simpson_mod6_beta_sim_res <- simulateResiduals(fittedModel = platform_simpson_mod6_beta)
plot(platform_simpson_mod6_beta_sim_res)
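Because the beta model uses a logit link, its fixed effects are changes in the log-odds of the mean Simpson diversity. A minimal sketch of converting the mod6 estimates to odds ratios follows. It assumes that rec_time_min is expressed in raw minutes (suggested by the size of its coefficient, but not confirmed here) and that the other two predictors are standardized, so that one unit equals one standard deviation.

```r
# Fixed-effect estimates copied from summary(platform_simpson_mod6_beta)
coefs <- c(
  Floral_simpson_index_T = 0.740144,
  rec_time_min           = -0.009834,
  Plot_Cover_T           = 0.935617
)

# exp() turns log-odds changes into odds ratios per one unit of each predictor
odds_ratios <- exp(coefs)

# Assumption: rec_time_min is in raw minutes, so rescale to a 60-minute change
or_per_hour <- exp(60 * coefs[["rec_time_min"]])

round(odds_ratios, 3)
round(or_per_hour, 3)
```

These ratios are conditional on the location random effect and hold the other predictors constant.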

IV.C.4.a. Compare the models with the performance package

# Compare the models with the performance package
platform_simpson_beta_comp1 <- compare_performance(platform_simpson_mod1_beta, platform_simpson_mod2_beta, platform_simpson_mod3_beta, platform_simpson_mod4_beta, platform_simpson_mod5_beta, platform_simpson_mod6_beta,
                                                  metrics = c("AICc", "BIC", "R2", "ICC", "RMSE"))
# Print the comparison table
print(platform_simpson_beta_comp1)
## # Comparison of Model Performance Indices
## 
## Name                       |   Model | AICc (weights) | BIC (weights)
## ---------------------------------------------------------------------
## platform_simpson_mod1_beta | glmmTMB |  -10.4 (0.013) | -13.8 (0.211)
## platform_simpson_mod2_beta | glmmTMB |  -11.9 (0.028) | -12.7 (0.122)
## platform_simpson_mod3_beta | glmmTMB |  -13.3 (0.057) | -12.2 (0.099)
## platform_simpson_mod4_beta | glmmTMB |  -16.0 (0.222) | -13.7 (0.201)
## platform_simpson_mod5_beta | glmmTMB |  -16.5 (0.287) | -13.4 (0.174)
## platform_simpson_mod6_beta | glmmTMB |  -17.2 (0.393) | -13.6 (0.194)
## 
## Name                       | R2 (cond.) | R2 (marg.) |   ICC |  RMSE
## --------------------------------------------------------------------
## platform_simpson_mod1_beta |      0.951 |      0.854 | 0.666 | 0.117
## platform_simpson_mod2_beta |      0.956 |      0.758 | 0.817 | 0.111
## platform_simpson_mod3_beta |      0.960 |      0.669 | 0.879 | 0.106
## platform_simpson_mod4_beta |      0.962 |      0.628 | 0.897 | 0.108
## platform_simpson_mod5_beta |      0.967 |      0.508 | 0.933 | 0.106
## platform_simpson_mod6_beta |      0.967 |      0.294 | 0.953 | 0.106
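The "weights" shown next to AICc in the table are Akaike weights. As a sketch of how they arise, the calculation below starts from the rounded AICc values printed above, so the results agree with the table only to about two decimals.

```r
# AICc values copied (rounded) from the comparison table above
aicc <- c(mod1 = -10.4, mod2 = -11.9, mod3 = -13.3,
          mod4 = -16.0, mod5 = -16.5, mod6 = -17.2)

# Difference of each model from the best (lowest) AICc
delta <- aicc - min(aicc)

# Akaike weights: relative likelihoods normalised to sum to 1
rel_lik  <- exp(-0.5 * delta)
akaike_w <- rel_lik / sum(rel_lik)

round(akaike_w, 3)
```

A weight can be read as the relative support for a model within this candidate set only; it says nothing about models that were not fitted.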

IV.C.4.b. Visualize the model results

plot_model(platform_simpson_mod1_beta , type = "est", show.values = TRUE, value.offset = .3)

plot_model(platform_simpson_mod2_beta , type = "est", show.values = TRUE, value.offset = .3)

plot_model(platform_simpson_mod3_beta , type = "est", show.values = TRUE, value.offset = .3)

plot_model(platform_simpson_mod4_beta , type = "est", show.values = TRUE, value.offset = .3)

plot_model(platform_simpson_mod5_beta , type = "est", show.values = TRUE, value.offset = .3)

plot_model(platform_simpson_mod6_beta , type = "est", show.values = TRUE, value.offset = .3)

plot_model(platform_simpson_mod6_beta, 
           type = "est", 
           show.values = TRUE, 
           value.offset = 0.3,
           #sort.est = TRUE,
           axis.labels = c("Flower Cover % per Transect",
                           "Recording time (min)",
                           "Floral Simpson Index"
                           )) +
    labs(title = "Platform Camera: Simpson Diversity of Pollinators", x = "Predictors",y = "Estimate") + 
    theme(axis.text.y = element_text(hjust = 0))  # 0 = left, 1 = right

##PLOT COVER platform_simpson_mod6_beta---------
# Get the original mean and SD of plot cover before scaling
plot_cover_mean <- mean(envir_data$Plot_Cover_T, na.rm = TRUE)
plot_cover_sd <- sd(envir_data$Plot_Cover_T, na.rm = TRUE)
# Get predictions on the scaled variable
pred_plot_cover <- ggpredict(platform_simpson_mod6_beta , terms = "Plot_Cover_T")
# Unscale the x-axis
pred_plot_cover$x_unscaled <- (pred_plot_cover$x * plot_cover_sd) + plot_cover_mean
# Plot
ggplot(pred_plot_cover, aes(x = x_unscaled, y = predicted)) +
  #plot with predictor color
  geom_line(size = 1.2, color = predictor_colors[["Plot_Cover_T"]]) +
  geom_ribbon(aes(ymin = conf.low, ymax = conf.high), 
              fill = alpha(predictor_colors[["Plot_Cover_T"]], 0.5)) +
  labs(
    title = "Platform: Predicted Insect Simpson Diversity vs Floral Cover %",
    x = "Floral Cover: average % of flower cover per transect",
    y = "Predicted Pollinator Simpson Diversity on platform cameras"
  )+
  #limit x axis to 0-120
  scale_x_continuous(limits = c(0, 120))
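The back-transformation used for the x-axis (x * SD + mean) recovers the original units only when the mean and SD are taken from the unscaled column that was passed to scale(). A minimal sketch with hypothetical cover values:

```r
# Hypothetical raw flower-cover percentages (illustrative values only)
cover_raw <- c(5, 12, 30, 48, 75, 90)

# scale() centres and divides by the SD; as.numeric() drops the matrix wrapper
cover_scaled <- as.numeric(scale(cover_raw))

# Back-transform with the mean and SD of the SAME raw vector
cover_back <- cover_scaled * sd(cover_raw) + mean(cover_raw)

all.equal(cover_back, cover_raw)
```

If the mean and SD were computed from a column that had already been scaled, they would be close to 0 and 1 and the "unscaled" axis would still be in standard-deviation units, so it is worth checking which version of the column is stored in the data frame.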

## floral simpson index platform_simpson_mod6_beta---------
# Get the original mean and SD of floral simpson index before scaling
floral_simpson_mean <- mean(envir_data$Floral_simpson_index_T, na.rm = TRUE)
floral_simpson_sd <- sd(envir_data$Floral_simpson_index_T, na.rm = TRUE)
# Get predictions on the scaled variable
pred_floral_simpson <- ggpredict(platform_simpson_mod6_beta , terms = "Floral_simpson_index_T")
# Unscale the x-axis
pred_floral_simpson$x_unscaled <- (pred_floral_simpson$x * floral_simpson_sd) + floral_simpson_mean
# Plot
ggplot(pred_floral_simpson, aes(x = x_unscaled, y = predicted)) +
  #plot with predictor color
  geom_line(size = 1.2, color = predictor_colors[["Floral_simpson_index_T"]]) +
  geom_ribbon(aes(ymin = conf.low, ymax = conf.high), 
              fill = alpha(predictor_colors[["Floral_simpson_index_T"]], 0.5)) +
  labs(
    title = "Platform: Predicted Insect Simpson Diversity vs Floral Simpson Index",
    x = "Floral Simpson Index",
    y = "Predicted Pollinator Simpson Diversity on platform cameras"
  )

## Rec time min platform_simpson_mod6_beta---------
# Get the original mean and SD of rec time min before scaling
rec_time_min_mean <- mean(platform_diversity$rec_time_min, na.rm = TRUE)
rec_time_min_sd <- sd(platform_diversity$rec_time_min, na.rm = TRUE)
# Get predictions on the scaled variable
pred_rec_time_min <- ggpredict(platform_simpson_mod6_beta , terms = "rec_time_min")
# Unscale the x-axis
pred_rec_time_min$x_unscaled <- (pred_rec_time_min$x * rec_time_min_sd) + rec_time_min_mean
# Plot
ggplot(pred_rec_time_min, aes(x = x_unscaled, y = predicted)) +
  #plot with predictor color
  geom_line(size = 1.2, color = predictor_colors[["rec_time_min"]]) +
  geom_ribbon(aes(ymin = conf.low, ymax = conf.high), 
              fill = alpha(predictor_colors[["rec_time_min"]], 0.5)) +
  labs(
    title = "Platform: Predicted Insect Simpson Diversity vs Recording time",
    x = "Recording time (minutes)",
    y = "Predicted Pollinator Simpson Diversity on platform cameras"
  )

IV.C.4.c. Interpretation of the model results

#remove all objects starting with platform_simpson_
rm(list = ls(pattern = "^platform_simpson_"))

IV.D. Flower cameras

The flower cameras took an image every 5 seconds, so the same insect can appear in many consecutive images. To reduce these repeats, we group images taken within the same minute and classified as the same family. We also remove images that are not classified to family level and those classified as other families. For now, the family confidence threshold is set to 0.5; this can be changed later if needed.
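The per-minute grouping described above can be sketched in base R on a toy table. The data frame and its values are hypothetical; only the column names cam, time, Family and minutes_since_9am mirror those used in this section.

```r
# Hypothetical detections: one row per classified image (illustrative only)
detections <- data.frame(
  cam    = c("cam31", "cam31", "cam31", "cam31", "cam31"),
  time   = c("09:00:05", "09:00:10", "09:00:15", "09:01:05", "09:01:10"),
  Family = c("Apidae", "Apidae", "Syrphidae", "Apidae", "Apidae"),
  stringsAsFactors = FALSE
)

# Minutes elapsed since 09:00, ignoring seconds
hours   <- as.numeric(substr(detections$time, 1, 2))
minutes <- as.numeric(substr(detections$time, 4, 5))
detections$minutes_since_9am <- (hours - 9) * 60 + minutes

# Keep one row per camera, minute and family
key     <- detections[, c("cam", "minutes_since_9am", "Family")]
reduced <- detections[!duplicated(key), ]

# Detections per family after collapsing repeats within a minute
table(reduced$Family)
```

In this toy example five images collapse to three detections: two Apidae images in the first minute count once, as do the two in the second minute.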

For the platform cameras, we took recording time into account because not all sessions were the same length. For the flower cameras, the recording either ran for the whole duration (360 minutes) or never started. That is why we have 40 rows instead of 5 flowers x 9 sites = 45 rows. The missing recordings (site, camera, flower species) are:
• JEP cam35 hyp_per
• KOT cam35 pas_sat
• WED cam35 pas_sat
• STP cam31 dau_car
• DES cam35 tri_pra

The flower cameras were not placed on specific transects, but on individual flowers that were representative of the most abundant flower species in the area. We therefore need to calculate the average flower cover and the average floral Simpson index per site instead of per transect.

# flower camera data
flower_camera_modelling <- flower_camera %>%
  #remove all rows in the classification category that are other_families
  filter(Classification_Category != "other_families") %>%
  #remove all rows that are below 0.5 in Family_Confidence
  filter(Family_Confidence >= 0.5) %>%
  #transform time as character 
  mutate(time = as.character(time)) %>%
  #transform specific row 08:59:13 to 09:01:00
  mutate(time = ifelse(time == "08:59:13", "09:01:00", time)) %>%
  #remove the last 3 characters of the time column
  mutate(time = substr(time, 1, nchar(time) - 3)) %>%
  #split the time column into two columns: hours and minutes
  separate(time, into = c("hours", "minutes"), sep = ":") %>%
  #convert the hours and minutes columns to numeric
  mutate(hours = as.numeric(hours),
         minutes = as.numeric(minutes)) %>%
  # transform time into minutes since 9am
  mutate(minutes_since_9am = (hours - 9) * 60 + minutes)
  
  
flower_camera_famcount_reduced <- flower_camera_modelling %>%
  #keep only the first row for each combination of site, date, camera, flower species, minute and family
  #this removes repeated detections of the same family within the same minute
  distinct(site, date, cam, flower_sp, minutes_since_9am, Family, .keep_all = TRUE) %>%
  
  #group by site, date, camera, flower species and family
  group_by(site, date , cam, flower_sp, Family) %>%
  #count the number of minutes in which each family was detected
  summarise(count = n(), .groups = 'drop')


#final count dataframe
flower_cam_count_full <- flower_camera_modelling %>%
  #count the number of images per site, date, camera and flower species
  group_by(site, date , cam, flower_sp) %>%
  summarise(count = n(), .groups = 'drop')

#plant survey data
planty <- relative_flower %>%
  dplyr::select(Site, Transect, average_flower_cover, Floral_simpson_index)%>%
  #average of Floral  simpson index per site
  group_by(Site,average_flower_cover) %>%
  summarise(Floral_simpson_index_site = mean(Floral_simpson_index), .groups = 'drop')
  

flower_cam_count_full <- envir_data %>%
  #remove data irrelevant for flower camera, such as transect, minutes since 9am, floral simpson index
  dplyr::select(-c(Transect, minutes_since_9am,Floral_simpson_index_T, Pastinaca.sativa, Daucus.carota, top2_ratio, Plot_Cover_T)) %>%
  #join the flower camera data
  left_join(flower_cam_count_full, by = c("Site"= "site","Date"= "date"))%>%
  distinct()%>%
  #join the plant survey data
  left_join(planty, by = c("Site"= "Site", "average_flower_cover"))%>%
  #scale the environmental data
  mutate(across(c(dm_wind_velocity, dm_temperature, average_flower_cover,agri,grass,snh,forest,urban,water,Days_since_start,Floral_simpson_index_site), scale))
## Warning in left_join(., flower_cam_count_full, by = c(Site = "site", Date = "date")): Detected an unexpected many-to-many relationship between `x` and `y`.
## ℹ Row 1 of `x` matches multiple rows in `y`.
## ℹ Row 33 of `y` matches multiple rows in `x`.
## ℹ If a many-to-many relationship is expected, set `relationship =
##   "many-to-many"` to silence this warning.
flower_cam_count_reduced <- flower_camera_famcount_reduced %>%
  #count the rows per site, date, camera and flower species (n() counts one row per family here; sum(count) would total the per-minute detections)
  group_by(site, date , cam, flower_sp) %>%
  summarise(count = n(), .groups = 'drop')%>%
  #join the plant survey data
  left_join(planty, by = c("site"= "Site"))%>%
  #join environmental data
  left_join(envir_data, by = c("site"= "Site","date"= "Date", "average_flower_cover"))%>%
  #remove columns that are not needed for the flower camera models
  dplyr::select(-c(Plot_Cover_T, Floral_simpson_index_T, Transect, minutes_since_9am, Pastinaca.sativa, Daucus.carota, top2_ratio, majority_class, urban, agri, snh, grass,water, forest))%>%
  #scale the environmental data
  mutate(across(c(dm_wind_velocity, dm_temperature, average_flower_cover, Days_since_start, Floral_simpson_index_site), scale))%>%
  distinct()
## Warning in left_join(., envir_data, by = c(site = "Site", date = "Date", : Detected an unexpected many-to-many relationship between `x` and `y`.
## ℹ Row 1 of `x` matches multiple rows in `y`.
## ℹ Row 41 of `y` matches multiple rows in `x`.
## ℹ If a many-to-many relationship is expected, set `relationship =
##   "many-to-many"` to silence this warning.
sum(flower_cam_count_full$count);sum(flower_cam_count_reduced$count)
## [1] 11814
## [1] 769
sum(flower_cam_count_reduced$count)/sum(flower_cam_count_full$count)
## [1] 0.06509226

flower_cam_count_full$count: W = 0.73, p-value = 3.35e-07 → Strong evidence against normality. Very right-skewed (Skewness = 1.91) and heavy-tailed (Kurtosis = 3.17). Wide range: from 2 to 1503, high variance (SD = 388.98).

flower_cam_count_reduced$count: W = 0.89, p-value = 0.00077 → Also not normal. Still skewed (Skewness = 0.90), but much more manageable. Lower range: 2 to 57, smaller variance (SD = 15.20).

IV.D.1. FLOWER CAMERA FULL Count - NB

#histogram of the number of images per flower camera
flower_cam_count_full %>%
  ggplot(aes(x = count)) +
  geom_histogram(binwidth=0.5,fill = "lightblue", color = "black") +
  labs(title = "Histogram of Number of Images per Flower Camera",
       x = "Number of Images",
       y = "Count")

#test normality of the number of images per flower camera
shapiro.test(flower_cam_count_full$count) # p-value = 3.35e-07, number of images per flower camera is not normally distributed
## 
##  Shapiro-Wilk normality test
## 
## data:  flower_cam_count_full$count
## W = 0.73163, p-value = 3.35e-07
datawizard::describe_distribution(flower_cam_count_full$count)
##   Mean |     SD |    IQR |           Range | Skewness | Kurtosis |  n | n_Missing
## ---------------------------------------------------------------------------------
## 295.35 | 388.98 | 373.75 | [2.00, 1503.00] |     1.91 |     3.17 | 40 |         0

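Since the response is a count and its variance (SD = 388.98) is far larger than its mean (295.35), we use a negative binomial distribution (nbinom2) rather than a Poisson distribution.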
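As a rough check on which count distribution suits these data, the variance-to-mean ratio can be computed from the summary statistics reported above. This is a sketch on the marginal distribution only: it ignores the predictors and the random effect, so the moment estimate of the dispersion parameter will not match the fitted model.

```r
# Summary statistics copied from describe_distribution() above
count_mean <- 295.35
count_sd   <- 388.98

# Under a Poisson distribution the variance equals the mean (ratio of 1)
variance_to_mean <- count_sd^2 / count_mean
variance_to_mean

# nbinom2 variance is mu + mu^2 / theta; solve for a crude moment estimate of theta
theta_moment <- count_mean^2 / (count_sd^2 - count_mean)
theta_moment
```

A ratio in the hundreds indicates strong overdispersion relative to a Poisson distribution, which is the situation the quadratic variance of nbinom2 is designed to handle.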

# full model with insect abundance (image count) as response variable and site type, weather and plant diversity variables as explanatory variables, with site as random effect; recording time is not included because all flower camera sessions had the same length (360 minutes)
flower_abundance_mod1_nb <- glmmTMB(count 
                                    ~ Site_type 
                                    #+ flower_sp
                                    + average_flower_cover 
                                    + Floral_simpson_index_site 
                                    + Days_since_start  
                                    + dm_wind_velocity  
                                    + dm_temperature  
                                    + (1 | Site),
                                    family = nbinom2,
                                    data = flower_cam_count_full
                                    )
summary(flower_abundance_mod1_nb)
##  Family: nbinom2  ( log )
## Formula:          
## count ~ Site_type + average_flower_cover + Floral_simpson_index_site +  
##     Days_since_start + dm_wind_velocity + dm_temperature + (1 |      Site)
## Data: flower_cam_count_full
## 
##      AIC      BIC   logLik deviance df.resid 
##    527.0    542.2   -254.5    509.0       31 
## 
## Random effects:
## 
## Conditional model:
##  Groups Name        Variance  Std.Dev.
##  Site   (Intercept) 2.767e-09 5.26e-05
## Number of obs: 40, groups:  Site, 9
## 
## Dispersion parameter for nbinom2 family (): 0.859 
## 
## Conditional model:
##                           Estimate Std. Error z value Pr(>|z|)    
## (Intercept)                 5.4487     0.8027   6.788 1.14e-11 ***
## Site_typeyoung_restored    -0.1823     1.7422  -0.105    0.917    
## average_flower_cover       -1.0996     1.1981  -0.918    0.359    
## Floral_simpson_index_site  -0.5371     0.6074  -0.884    0.377    
## Days_since_start            0.2198     0.2358   0.932    0.351    
## dm_wind_velocity           -0.4945     0.7364  -0.671    0.502    
## dm_temperature             -0.3760     1.1753  -0.320    0.749    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
parameters(flower_abundance_mod1_nb)
## Your model may suffer from singularity (see `?lme4::isSingular` and
##   `?performance::check_singularity`).
##   Some of the confidence intervals of the random effects parameters are
##   probably not meaningful!
##   You may try to impose a prior on the random effects parameters, e.g.
##   using the glmmTMB package.
## # Fixed Effects
## 
## Parameter                  | Log-Mean |   SE |        95% CI |     z |      p
## -----------------------------------------------------------------------------
## (Intercept)                |     5.45 | 0.80 | [ 3.88, 7.02] |  6.79 | < .001
## Site type [young_restored] |    -0.18 | 1.74 | [-3.60, 3.23] | -0.10 | 0.917 
## average flower cover       |    -1.10 | 1.20 | [-3.45, 1.25] | -0.92 | 0.359 
## Floral simpson index site  |    -0.54 | 0.61 | [-1.73, 0.65] | -0.88 | 0.377 
## Days since start           |     0.22 | 0.24 | [-0.24, 0.68] |  0.93 | 0.351 
## dm wind velocity           |    -0.49 | 0.74 | [-1.94, 0.95] | -0.67 | 0.502 
## dm temperature             |    -0.38 | 1.18 | [-2.68, 1.93] | -0.32 | 0.749 
## 
## # Dispersion
## 
## Parameter   | Coefficient |       95% CI
## ----------------------------------------
## (Intercept) |        0.86 | [0.58, 1.27]
## 
## # Random Effects Variances
## 
## Parameter            | Coefficient |      95% CI
## ------------------------------------------------
## SD (Intercept: Site) |    5.26e-05 | [0.00, Inf]
## 
## Uncertainty intervals (equal-tailed) and p-values (two-tailed) computed
##   using a Wald z-distribution approximation.
#check for singularity
performance::check_singularity(flower_abundance_mod1_nb)
## [1] TRUE
#check the model
check_model(flower_abundance_mod1_nb, verbose = T)
## Homogeneity of variance could not be computed. Cannot extract residual
##   variance from objects of class 'glmmTMB'.
## `check_outliers()` does not yet support models of class `glmmTMB`.

#overdispersion
check_overdispersion(flower_abundance_mod1_nb)
## # Overdispersion test
## 
##  dispersion ratio = 0.806
##           p-value = 0.984
## No overdispersion detected.
#collinearity
check_collinearity(flower_abundance_mod1_nb)
## # Check for Multicollinearity
## 
## Low Correlation
## 
##              Term  VIF     VIF 95% CI Increased SE Tolerance Tolerance 95% CI
##  Days_since_start 2.20 [ 1.62,  3.31]         1.48      0.45     [0.30, 0.62]
## 
## High Correlation
## 
##                       Term   VIF     VIF 95% CI Increased SE Tolerance
##                  Site_type 25.67 [16.37, 40.58]         5.07      0.04
##       average_flower_cover 46.10 [29.23, 73.06]         6.79      0.02
##  Floral_simpson_index_site 11.18 [ 7.26, 17.55]         3.34      0.09
##           dm_wind_velocity 17.71 [11.37, 27.93]         4.21      0.06
##             dm_temperature 39.98 [25.38, 63.33]         6.32      0.03
##  Tolerance 95% CI
##      [0.02, 0.06]
##      [0.01, 0.03]
##      [0.06, 0.14]
##      [0.04, 0.09]
##      [0.02, 0.04]
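The columns of the collinearity table are linked by simple formulas: Tolerance = 1 / VIF and Increased SE = sqrt(VIF). The sketch below illustrates the classical definition of VIF for linear predictors on simulated data; check_collinearity() may derive its values differently for mixed models, so this is an illustration of the concept rather than a reproduction of the table. All names and data are hypothetical.

```r
set.seed(1)
n  <- 200
x1 <- rnorm(n)
x2 <- x1 + rnorm(n, sd = 0.1)  # nearly a copy of x1
x3 <- rnorm(n)                 # unrelated to the others
predictors <- data.frame(x1, x2, x3)

# VIF_j = 1 / (1 - R^2_j), with R^2_j from regressing predictor j on the rest
vif_one <- function(name, data) {
  fit <- lm(reformulate(setdiff(names(data), name), response = name), data = data)
  1 / (1 - summary(fit)$r.squared)
}

vif       <- sapply(names(predictors), vif_one, data = predictors)
tolerance <- 1 / vif       # "Tolerance" column
se_factor <- sqrt(vif)     # "Increased SE" column

round(cbind(vif, tolerance, se_factor), 2)
```

The two nearly identical predictors get very large VIFs while the independent one stays close to 1, which mirrors the pattern seen when site-level covariates largely determine each other across only nine sites.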
# DHARMa package - simulate residuals and check model assumptions
flower_abundance_mod1_nb_sim_res <- simulateResiduals(fittedModel = flower_abundance_mod1_nb)
plot(flower_abundance_mod1_nb_sim_res)

#remove site type (p= 0.917  for flower_abundance_mod1_nb)
flower_abundance_mod2_nb <- glmer.nb(count 
                                     #~ Site_type
                                     ~ average_flower_cover
                                     #+ flower_sp
                                     + Floral_simpson_index_site
                                     + Days_since_start
                                     + dm_wind_velocity 
                                     + dm_temperature 
                                     + (1 | Site), 
                                   #negative binomial distribution model (fitted with lme4::glmer.nb, unlike mod1 and mod3, which use glmmTMB)
                                   family = nbinom2,
                                   data = flower_cam_count_full)
## boundary (singular) fit: see help('isSingular')
summary(flower_abundance_mod2_nb)
## Generalized linear mixed model fit by maximum likelihood (Laplace
##   Approximation) [glmerMod]
##  Family: Negative Binomial(0.8592)  ( log )
## Formula: 
## count ~ average_flower_cover + Floral_simpson_index_site + Days_since_start +  
##     dm_wind_velocity + dm_temperature + (1 | Site)
##    Data: flower_cam_count_full
## 
##      AIC      BIC   logLik deviance df.resid 
##    525.0    538.5   -254.5    509.0       32 
## 
## Scaled residuals: 
##     Min      1Q  Median      3Q     Max 
## -0.9131 -0.7385 -0.3616  0.4881  2.3015 
## 
## Random effects:
##  Groups Name        Variance  Std.Dev. 
##  Site   (Intercept) 5.903e-11 7.683e-06
## Number of obs: 40, groups:  Site, 9
## 
## Fixed effects:
##                           Estimate Std. Error z value Pr(>|z|)    
## (Intercept)                 5.3668     0.1711  31.364   <2e-16 ***
## average_flower_cover       -0.9805     0.3759  -2.608   0.0091 ** 
## Floral_simpson_index_site  -0.4888     0.3939  -1.241   0.2147    
## Days_since_start            0.2317     0.2066   1.122   0.2620    
## dm_wind_velocity           -0.5661     0.2751  -2.058   0.0396 *  
## dm_temperature             -0.4959     0.2681  -1.849   0.0644 .  
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Correlation of Fixed Effects:
##             (Intr) avrg__ Flr___ Dys_s_ dm_wn_
## avrg_flwr_c  0.003                            
## Flrl_smps__  0.001  0.814                     
## Dys_snc_str -0.001  0.152  0.091              
## dm_wnd_vlct  0.001 -0.007  0.250 -0.575       
## dm_tempertr  0.001 -0.218  0.147 -0.296  0.609
## optimizer (Nelder_Mead) convergence code: 0 (OK)
## boundary (singular) fit: see help('isSingular')
parameters(flower_abundance_mod2_nb)
## # Fixed Effects
## 
## Parameter                 | Log-Mean |   SE |         95% CI |     z |      p
## -----------------------------------------------------------------------------
## (Intercept)               |     5.37 | 0.17 | [ 5.03,  5.70] | 31.36 | < .001
## average flower cover      |    -0.98 | 0.38 | [-1.72, -0.24] | -2.61 | 0.009 
## Floral simpson index site |    -0.49 | 0.39 | [-1.26,  0.28] | -1.24 | 0.215 
## Days since start          |     0.23 | 0.21 | [-0.17,  0.64] |  1.12 | 0.262 
## dm wind velocity          |    -0.57 | 0.28 | [-1.11, -0.03] | -2.06 | 0.040 
## dm temperature            |    -0.50 | 0.27 | [-1.02,  0.03] | -1.85 | 0.064 
## 
## # Random Effects
## 
## Parameter            | Coefficient
## ----------------------------------
## SD (Intercept: Site) |    7.68e-06
## 
## Uncertainty intervals (equal-tailed) and p-values (two-tailed) computed
##   using a Wald z-distribution approximation.
#check for singularity
performance::check_singularity(flower_abundance_mod2_nb)
## [1] TRUE
#check the model
check_model(flower_abundance_mod2_nb, verbose = T)
## Some of the variables were in matrix-format - probably you used
##   `scale()` on your data?
##   If so, and you get an error, please try `datawizard::standardize()` to
##   standardize your data.
## Some of the variables were in matrix-format - probably you used
##   `scale()` on your data?
##   If so, and you get an error, please try `datawizard::standardize()` to
##   standardize your data.
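The message above appears because scale() returns a one-column matrix, which mutate(across(..., scale)) then stores inside the data frame. The message itself suggests datawizard::standardize(); a base R alternative, sketched here on hypothetical values, is to wrap the call in as.numeric(). The column name dm_temperature_z is illustrative only.

```r
# Hypothetical data frame with one raw predictor (illustrative values only)
env <- data.frame(dm_temperature = c(18.2, 21.5, 24.1, 19.8, 22.7))

# scale() returns a one-column matrix carrying centre and scale attributes
scaled_matrix <- scale(env$dm_temperature)
is.matrix(scaled_matrix)

# Wrapping in as.numeric() stores a plain numeric vector instead
env$dm_temperature_z <- as.numeric(scale(env$dm_temperature))
is.matrix(env$dm_temperature_z)
```

The standardized values are identical in both cases; only the storage type changes, which is what downstream helpers such as check_model() care about.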

#overdispersion
check_overdispersion(flower_abundance_mod2_nb)
## # Overdispersion test
## 
##  dispersion ratio = 0.764
##           p-value = 0.992
## No overdispersion detected.
#collinearity
check_collinearity(flower_abundance_mod2_nb)
## # Check for Multicollinearity
## 
## Low Correlation
## 
##                       Term  VIF   VIF 95% CI Increased SE Tolerance
##       average_flower_cover 4.59 [3.04, 7.29]         2.14      0.22
##  Floral_simpson_index_site 4.75 [3.14, 7.56]         2.18      0.21
##           Days_since_start 1.67 [1.28, 2.58]         1.29      0.60
##           dm_wind_velocity 2.47 [1.76, 3.84]         1.57      0.41
##             dm_temperature 2.10 [1.54, 3.25]         1.45      0.48
##  Tolerance 95% CI
##      [0.14, 0.33]
##      [0.13, 0.32]
##      [0.39, 0.78]
##      [0.26, 0.57]
##      [0.31, 0.65]
# DHARMa package - simulate residuals and check model assumptions
flower_abundance_mod2_nb_sim_res <- simulateResiduals(fittedModel = flower_abundance_mod2_nb)
plot(flower_abundance_mod2_nb_sim_res)

report(flower_abundance_mod2_nb)
## boundary (singular) fit: see help('isSingular')
## boundary (singular) fit: see help('isSingular')
## We fitted a negative-binomial mixed model (estimated using ML and Nelder-Mead
## optimizer) to predict count with average_flower_cover,
## Floral_simpson_index_site, Days_since_start, dm_wind_velocity and
## dm_temperature (formula: count ~ average_flower_cover +
## Floral_simpson_index_site + Days_since_start + dm_wind_velocity +
## dm_temperature). The model included Site as random effect (formula: ~1 | Site).
## The model's intercept, corresponding to average_flower_cover = 0,
## Floral_simpson_index_site = 0, Days_since_start = 0, dm_wind_velocity = 0 and
## dm_temperature = 0, is at 5.37 (95% CI [5.03, 5.70], p < .001). Within this
## model:
## 
##   - The effect of average flower cover is statistically significant and negative
## (beta = -0.98, 95% CI [-1.72, -0.24], p = 0.009; Std. beta = -0.98, 95% CI
## [-1.72, -0.24])
##   - The effect of Floral simpson index site is statistically non-significant and
## negative (beta = -0.49, 95% CI [-1.26, 0.28], p = 0.215; Std. beta = -0.49, 95%
## CI [-1.26, 0.28])
##   - The effect of Days since start is statistically non-significant and positive
## (beta = 0.23, 95% CI [-0.17, 0.64], p = 0.262; Std. beta = 0.23, 95% CI [-0.17,
## 0.64])
##   - The effect of dm wind velocity is statistically significant and negative
## (beta = -0.57, 95% CI [-1.11, -0.03], p = 0.040; Std. beta = -0.57, 95% CI
## [-1.11, -0.03])
##   - The effect of dm temperature is statistically non-significant and negative
## (beta = -0.50, 95% CI [-1.02, 0.03], p = 0.064; Std. beta = -0.50, 95% CI
## [-1.02, 0.03])
## 
## Standardized parameters were obtained by fitting the model on a standardized
## version of the dataset. 95% Confidence Intervals (CIs) and p-values were
## computed using a Wald z-distribution approximation.
#remove days since start (p= 0.262   for flower_abundance_mod2_nb)
flower_abundance_mod3_nb <- glmmTMB(count 
                                     #~ Site_type
                                     #+ flower_sp
                                     ~ average_flower_cover
                                     + Floral_simpson_index_site
                                     #+ Days_since_start
                                     + dm_wind_velocity 
                                     + dm_temperature 
                                     + (1 | Site), 
                                   #negative binomial distribution model
                                   family = nbinom2,
                                   data = flower_cam_count_full)

summary(flower_abundance_mod3_nb)
##  Family: nbinom2  ( log )
## Formula:          
## count ~ average_flower_cover + Floral_simpson_index_site + dm_wind_velocity +  
##     dm_temperature + (1 | Site)
## Data: flower_cam_count_full
## 
##      AIC      BIC   logLik deviance df.resid 
##    524.3    536.1   -255.1    510.3       33 
## 
## Random effects:
## 
## Conditional model:
##  Groups Name        Variance  Std.Dev. 
##  Site   (Intercept) 3.313e-09 5.756e-05
## Number of obs: 40, groups:  Site, 9
## 
## Dispersion parameter for nbinom2 family (): 0.839 
## 
## Conditional model:
##                           Estimate Std. Error z value Pr(>|z|)    
## (Intercept)                 5.3853     0.1732  31.102  < 2e-16 ***
## average_flower_cover       -1.0631     0.3752  -2.833  0.00461 ** 
## Floral_simpson_index_site  -0.5386     0.3917  -1.375  0.16915    
## dm_wind_velocity           -0.3864     0.2139  -1.806  0.07084 .  
## dm_temperature             -0.3992     0.2670  -1.495  0.13488    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
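
The `nbinom2` family fitted above assumes a quadratic mean-variance relationship, Var(y) = mu + mu^2 / theta, where theta is the dispersion parameter (0.839 in this summary). The sketch below is only an illustration in base R: it takes the intercept and dispersion estimate from the summary and compares the analytic variance with a simulated one. The seed and simulation size are arbitrary assumptions, and `rnbinom()`'s `size` argument is used as theta.

```r
# Values copied from summary(flower_abundance_mod3_nb)
intercept <- 5.3853   # log-mean when all scaled predictors are at 0
theta     <- 0.839    # nbinom2 dispersion parameter

mu <- exp(intercept)                # expected count on the response scale
var_analytic <- mu + mu^2 / theta   # quadratic (NB2) variance function

# Simulation check (seed and n are illustrative assumptions)
set.seed(1)
y_sim <- rnbinom(1e5, size = theta, mu = mu)
var_sim <- var(y_sim)

round(c(mean = mu, var_analytic = var_analytic, var_sim = var_sim), 1)
```

A small theta therefore implies a variance far above the mean, which is the reason a Poisson model would be a poor fit for these counts.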
parameters(flower_abundance_mod3_nb)
## Your model may suffer from singularity (see `?lme4::isSingular` and
##   `?performance::check_singularity`).
##   Some of the confidence intervals of the random effects parameters are
##   probably not meaningful!
##   You may try to impose a prior on the random effects parameters, e.g.
##   using the glmmTMB package.
## # Fixed Effects
## 
## Parameter                 | Log-Mean |   SE |         95% CI |     z |      p
## -----------------------------------------------------------------------------
## (Intercept)               |     5.39 | 0.17 | [ 5.05,  5.72] | 31.10 | < .001
## average flower cover      |    -1.06 | 0.38 | [-1.80, -0.33] | -2.83 | 0.005 
## Floral simpson index site |    -0.54 | 0.39 | [-1.31,  0.23] | -1.37 | 0.169 
## dm wind velocity          |    -0.39 | 0.21 | [-0.81,  0.03] | -1.81 | 0.071 
## dm temperature            |    -0.40 | 0.27 | [-0.92,  0.12] | -1.50 | 0.135 
## 
## # Dispersion
## 
## Parameter   | Coefficient |       95% CI
## ----------------------------------------
## (Intercept) |        0.84 | [0.57, 1.23]
## 
## # Random Effects Variances
## 
## Parameter            | Coefficient |      95% CI
## ------------------------------------------------
## SD (Intercept: Site) |    5.76e-05 | [0.00, Inf]
## 
## Uncertainty intervals (equal-tailed) and p-values (two-tailed) computed
##   using a Wald z-distribution approximation.
#check for singularity
performance::check_singularity(flower_abundance_mod3_nb)
## [1] TRUE
#check the model
check_model(flower_abundance_mod3_nb, verbose = TRUE)
## Homogeneity of variance could not be computed. Cannot extract residual
##   variance from objects of class 'glmmTMB'.
## `check_outliers()` does not yet support models of class `glmmTMB`.

#overdispersion
check_overdispersion(flower_abundance_mod3_nb)
## # Overdispersion test
## 
##  dispersion ratio = 0.801
##           p-value =  0.92
## No overdispersion detected.
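
For orientation, one common definition of a dispersion ratio is the sum of squared Pearson residuals divided by the residual degrees of freedom; values well above 1 point to overdispersion. The sketch below uses simulated data (not the flower camera data) and a plain Poisson `glm()` to show the ratio exceeding 1 when the counts are in fact negative binomial. The `performance` package may compute its ratio differently for `glmmTMB` models, so this is an assumption-laden illustration rather than a re-implementation.

```r
# Simulated, overdispersed counts (illustrative values only)
set.seed(42)
n  <- 200
x  <- rnorm(n)
mu <- exp(1 + 0.5 * x)
y  <- rnbinom(n, size = 0.8, mu = mu)

# Deliberately misspecified Poisson fit
fit_pois <- glm(y ~ x, family = poisson)

# Pearson dispersion ratio: about 1 if the Poisson variance assumption holds
pearson_res <- residuals(fit_pois, type = "pearson")
disp_ratio  <- sum(pearson_res^2) / df.residual(fit_pois)
disp_ratio
```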
#collinearity
check_collinearity(flower_abundance_mod3_nb)
## # Check for Multicollinearity
## 
## Low Correlation
## 
##                       Term  VIF   VIF 95% CI Increased SE Tolerance Tolerance 95% CI
##       average_flower_cover 4.00 [2.69, 6.33]         2.00      0.25     [0.16, 0.37]
##  Floral_simpson_index_site 4.08 [2.73, 6.46]         2.02      0.25     [0.15, 0.37]
##           dm_wind_velocity 1.47 [1.17, 2.30]         1.21      0.68     [0.43, 0.85]
##             dm_temperature 1.76 [1.34, 2.72]         1.33      0.57     [0.37, 0.75]
# DHARMa package - simulate residuals and check model assumptions
flower_abundance_mod3_nb_sim_res <- simulateResiduals(fittedModel = flower_abundance_mod3_nb)
plot(flower_abundance_mod3_nb_sim_res)
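
The idea behind simulation-based residuals can be sketched in base R. The example below is a simplified illustration on simulated Poisson data, not DHARMa's actual implementation: new responses are simulated from the fitted model, and each observation is placed within its own simulated distribution, with ties randomised. All data and object names here are hypothetical.

```r
# Simulated example data; not the flower camera data
set.seed(123)
n <- 200
x <- rnorm(n)
y <- rpois(n, lambda = exp(0.5 + 0.3 * x))
fit <- glm(y ~ x, family = poisson)

# Simulate new responses from the fitted model (one column per simulation)
sims <- as.matrix(simulate(fit, nsim = 250))

# Share of simulations below (or at most equal to) each observed value
p_lower <- rowMeans(sims < y)
p_upper <- rowMeans(sims <= y)

# Randomise within ties so discrete responses give continuous residuals
scaled_res <- runif(n, min = p_lower, max = p_upper)

# Under a correctly specified model these should look uniform on [0, 1]
summary(scaled_res)
```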

#remove floral simpson index (p = 0.169 for flower_abundance_mod3_nb)
flower_abundance_mod4_nb <- glmmTMB(count 
                                     #~ Site_type
                                     #+ flower_sp
                                     ~ average_flower_cover
                                     #+ Floral_simpson_index_site
                                     #+ Days_since_start
                                     + dm_wind_velocity 
                                     + dm_temperature 
                                     + (1 | Site), 
                                   #negative binomial distribution model
                                   family = nbinom2,
                                   data = flower_cam_count_full)
summary(flower_abundance_mod4_nb)
##  Family: nbinom2  ( log )
## Formula:          
## count ~ average_flower_cover + dm_wind_velocity + dm_temperature +  
##     (1 | Site)
## Data: flower_cam_count_full
## 
##      AIC      BIC   logLik deviance df.resid 
##    524.0    534.1   -256.0    512.0       34 
## 
## Random effects:
## 
## Conditional model:
##  Groups Name        Variance  Std.Dev. 
##  Site   (Intercept) 4.228e-09 6.502e-05
## Number of obs: 40, groups:  Site, 9
## 
## Dispersion parameter for nbinom2 family (): 0.812 
## 
## Conditional model:
##                      Estimate Std. Error z value Pr(>|z|)    
## (Intercept)            5.4111     0.1760  30.750  < 2e-16 ***
## average_flower_cover  -0.6716     0.2407  -2.790  0.00527 ** 
## dm_wind_velocity      -0.2983     0.2102  -1.419  0.15576    
## dm_temperature        -0.3277     0.2645  -1.239  0.21538    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
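
The elimination step from mod3 to mod4 can be cross-checked with a likelihood-ratio test, using only the log-likelihoods printed in the two summaries. This is a rough sketch: the log-likelihoods are rounded to one decimal in the output, so the statistic is approximate, and the parameter counts (fixed effects plus the random-intercept variance plus the dispersion parameter) are inferred from the residual degrees of freedom.

```r
# Log-likelihoods as printed in the two summaries
logLik_mod3 <- -255.1   # with Floral_simpson_index_site
logLik_mod4 <- -256.0   # without it

# Likelihood-ratio statistic for dropping one fixed effect (1 df)
lr_stat <- 2 * (logLik_mod3 - logLik_mod4)
p_lrt   <- pchisq(lr_stat, df = 1, lower.tail = FALSE)

# AIC = 2k - 2 * logLik, with k = 7 (mod3) and k = 6 (mod4)
aic_mod3 <- 2 * 7 - 2 * logLik_mod3
aic_mod4 <- 2 * 6 - 2 * logLik_mod4

round(c(lr_stat = lr_stat, p_lrt = p_lrt, aic_mod3 = aic_mod3, aic_mod4 = aic_mod4), 3)
```

The approximate p-value is in line with the Wald p-value of 0.169 reported for the Floral Simpson index in mod3, and the recomputed AIC values match the printed ones up to rounding.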
parameters(flower_abundance_mod4_nb)
## Your model may suffer from singularity (see `?lme4::isSingular` and
##   `?performance::check_singularity`).
##   Some of the confidence intervals of the random effects parameters are
##   probably not meaningful!
##   You may try to impose a prior on the random effects parameters, e.g.
##   using the glmmTMB package.
## # Fixed Effects
## 
## Parameter            | Log-Mean |   SE |         95% CI |     z |      p
## ------------------------------------------------------------------------
## (Intercept)          |     5.41 | 0.18 | [ 5.07,  5.76] | 30.75 | < .001
## average flower cover |    -0.67 | 0.24 | [-1.14, -0.20] | -2.79 | 0.005 
## dm wind velocity     |    -0.30 | 0.21 | [-0.71,  0.11] | -1.42 | 0.156 
## dm temperature       |    -0.33 | 0.26 | [-0.85,  0.19] | -1.24 | 0.215 
## 
## # Dispersion
## 
## Parameter   | Coefficient |       95% CI
## ----------------------------------------
## (Intercept) |        0.81 | [0.55, 1.19]
## 
## # Random Effects Variances
## 
## Parameter            | Coefficient |      95% CI
## ------------------------------------------------
## SD (Intercept: Site) |    6.50e-05 | [0.00, Inf]
## 
## Uncertainty intervals (equal-tailed) and p-values (two-tailed) computed
##   using a Wald z-distribution approximation.
#check for singularity
performance::check_singularity(flower_abundance_mod4_nb)
## [1] TRUE
#check the model
check_model(flower_abundance_mod4_nb, verbose = TRUE)
## Homogeneity of variance could not be computed. Cannot extract residual
##   variance from objects of class 'glmmTMB'.
## `check_outliers()` does not yet support models of class `glmmTMB`.

#overdispersion
check_overdispersion(flower_abundance_mod4_nb)
## # Overdispersion test
## 
##  dispersion ratio = 0.812
##           p-value = 0.992
## No overdispersion detected.
#collinearity
check_collinearity(flower_abundance_mod4_nb)
## # Check for Multicollinearity
## 
## Low Correlation
## 
##                  Term  VIF   VIF 95% CI Increased SE Tolerance Tolerance 95% CI
##  average_flower_cover 1.57 [1.22, 2.48]         1.25      0.64     [0.40, 0.82]
##      dm_wind_velocity 1.34 [1.10, 2.20]         1.16      0.75     [0.46, 0.91]
##        dm_temperature 1.79 [1.35, 2.81]         1.34      0.56     [0.36, 0.74]
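
The columns of the collinearity table are linked: tolerance is the reciprocal of the VIF, and the "Increased SE" column is its square root. A minimal check, using only the VIF values printed above for mod4:

```r
# VIF values copied from check_collinearity(flower_abundance_mod4_nb)
vif <- c(average_flower_cover = 1.57,
         dm_wind_velocity     = 1.34,
         dm_temperature       = 1.79)

tolerance    <- 1 / vif     # share of a predictor's variance not explained by the others
increased_se <- sqrt(vif)   # factor by which the standard error is inflated

round(cbind(vif, tolerance, increased_se), 2)
```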
# DHARMa package - simulate residuals and check model assumptions
flower_abundance_mod4_nb_sim_res <- simulateResiduals(fittedModel = flower_abundance_mod4_nb)
plot(flower_abundance_mod4_nb_sim_res)

#remove dm temperature (p = 0.215 for flower_abundance_mod4_nb)
flower_abundance_mod5_nb <- glmmTMB(count 
                                     #~ Site_type
                                     #+ flower_sp
                                     ~ average_flower_cover
                                     #+ Floral_simpson_index_site
                                     #+ Days_since_start
                                     + dm_wind_velocity 
                                     #+ dm_temperature 
                                     + (1 | Site), 
                                   #negative binomial distribution model
                                   family = nbinom2,
                                   data = flower_cam_count_full)
summary(flower_abundance_mod5_nb)
##  Family: nbinom2  ( log )
## Formula:          count ~ average_flower_cover + dm_wind_velocity + (1 | Site)
## Data: flower_cam_count_full
## 
##      AIC      BIC   logLik deviance df.resid 
##    523.5    531.9   -256.8    513.5       35 
## 
## Random effects:
## 
## Conditional model:
##  Groups Name        Variance  Std.Dev. 
##  Site   (Intercept) 7.655e-09 8.749e-05
## Number of obs: 40, groups:  Site, 9
## 
## Dispersion parameter for nbinom2 family (): 0.788 
## 
## Conditional model:
##                      Estimate Std. Error z value Pr(>|z|)    
## (Intercept)            5.4352     0.1785  30.442  < 2e-16 ***
## average_flower_cover  -0.8567     0.1959  -4.373 1.23e-05 ***
## dm_wind_velocity      -0.1574     0.1773  -0.888    0.374    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
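
The z values, p-values and confidence intervals in these tables follow from the estimates and standard errors under the Wald z approximation named in the output. A short sketch, using the two slopes of mod5 as printed above:

```r
# Estimates and standard errors from summary(flower_abundance_mod5_nb)
est <- c(average_flower_cover = -0.8567, dm_wind_velocity = -0.1574)
se  <- c(average_flower_cover =  0.1959, dm_wind_velocity =  0.1773)

z  <- est / se                # Wald z statistic
p  <- 2 * pnorm(-abs(z))      # two-tailed p-value
ci <- cbind(lower = est - qnorm(0.975) * se,
            upper = est + qnorm(0.975) * se)

round(cbind(z = z, p = p, ci), 3)
```

Small differences from the printed values arise because the inputs are already rounded to four decimals.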
parameters(flower_abundance_mod5_nb)
## Your model may suffer from singularity (see `?lme4::isSingular` and
##   `?performance::check_singularity`).
##   Some of the confidence intervals of the random effects parameters are
##   probably not meaningful!
##   You may try to impose a prior on the random effects parameters, e.g.
##   using the glmmTMB package.
## # Fixed Effects
## 
## Parameter            | Log-Mean |   SE |         95% CI |     z |      p
## ------------------------------------------------------------------------
## (Intercept)          |     5.44 | 0.18 | [ 5.09,  5.79] | 30.44 | < .001
## average flower cover |    -0.86 | 0.20 | [-1.24, -0.47] | -4.37 | < .001
## dm wind velocity     |    -0.16 | 0.18 | [-0.50,  0.19] | -0.89 | 0.374 
## 
## # Dispersion
## 
## Parameter   | Coefficient |       95% CI
## ----------------------------------------
## (Intercept) |        0.79 | [0.54, 1.16]
## 
## # Random Effects Variances
## 
## Parameter            | Coefficient |      95% CI
## ------------------------------------------------
## SD (Intercept: Site) |    8.75e-05 | [0.00, Inf]
## 
## Uncertainty intervals (equal-tailed) and p-values (two-tailed) computed
##   using a Wald z-distribution approximation.
#check for singularity
performance::check_singularity(flower_abundance_mod5_nb)
## [1] TRUE
#check the model
check_model(flower_abundance_mod5_nb, verbose = TRUE)
## Homogeneity of variance could not be computed. Cannot extract residual
##   variance from objects of class 'glmmTMB'.
## `check_outliers()` does not yet support models of class `glmmTMB`.

#overdispersion
check_overdispersion(flower_abundance_mod5_nb)
## # Overdispersion test
## 
##  dispersion ratio = 0.795
##           p-value = 0.968
## No overdispersion detected.
#collinearity
check_collinearity(flower_abundance_mod5_nb)
## # Check for Multicollinearity
## 
## Low Correlation
## 
##                  Term  VIF  VIF 95% CI Increased SE Tolerance Tolerance 95% CI
##  average_flower_cover 1.00 [1.00, Inf]         1.00      1.00     [0.00, 1.00]
##      dm_wind_velocity 1.00 [1.00, Inf]         1.00      1.00     [0.00, 1.00]
# DHARMa package - simulate residuals and check model assumptions
flower_abundance_mod5_nb_sim_res <- simulateResiduals(fittedModel = flower_abundance_mod5_nb)
plot(flower_abundance_mod5_nb_sim_res)

IV.D.1.a. Compare the models with the performance package

# Compare the models with the performance package
flower_abundance_nb_comp1 <- compare_performance(flower_abundance_mod1_nb, flower_abundance_mod2_nb, flower_abundance_mod3_nb, flower_abundance_mod4_nb, flower_abundance_mod5_nb,
                                                 metrics = c("AICc", "BIC", "R2", "ICC", "RMSE"))
# Print the comparison table
print(flower_abundance_nb_comp1)
## # Comparison of Model Performance Indices
## 
## Name                     |    Model | AICc (weights) | BIC (weights) |    RMSE
## ------------------------------------------------------------------------------
## flower_abundance_mod1_nb |  glmmTMB |  533.0 (0.011) | 542.2 (0.004) | 334.341
## flower_abundance_mod2_nb | glmerMod |  529.7 (0.057) | 538.5 (0.025) | 336.254
## flower_abundance_mod3_nb |  glmmTMB |  527.8 (0.146) | 536.1 (0.083) | 341.491
## flower_abundance_mod4_nb |  glmmTMB |  526.5 (0.274) | 534.1 (0.225) | 343.547
## flower_abundance_mod5_nb |  glmmTMB |  525.3 (0.512) | 531.9 (0.663) | 360.918
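
The weights in the AICc column can be reproduced from the AICc values themselves. The sketch below assumes the standard Akaike-weight formula and uses the rounded AICc values from the table, so the results agree with the printed weights only up to rounding. The short model names are labels chosen for this example.

```r
# AICc values copied from the comparison table
aicc <- c(mod1 = 533.0, mod2 = 529.7, mod3 = 527.8, mod4 = 526.5, mod5 = 525.3)

delta    <- aicc - min(aicc)         # difference to the best-supported model
rel_lik  <- exp(-0.5 * delta)        # relative likelihood of each model
akaike_w <- rel_lik / sum(rel_lik)   # Akaike weights, which sum to 1

round(akaike_w, 3)
```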

IV.D.1.b. Visualize the model results

#plot_model(flower_abundance_mod1_nb , type = "est", show.values = TRUE, value.offset = .3)
#plot_model(flower_abundance_mod2_nb , type = "est", show.values = TRUE, value.offset = .3)

(est_flowcam <- plot_model(flower_abundance_mod5_nb, 
           type = "est", 
           show.values = TRUE, 
           value.offset = 0.3,
           #sort.est = TRUE,
           axis.labels = c(#"Temperature", 
                           "Wind Velocity (km/h)",
                           #"Days since start",
                           #"Floral Simpson Index",
                           "Average Flower Cover %"
                           )) +
    labs(title = "Flower Camera: Activity of Pollinators", x = "Predictors",y = "Estimate") + 
    theme(axis.text.y = element_text(hjust = 0)) ) # 0 = left, 1 = right

#plot_model(flower_abundance_mod3_nb , type = "est", show.values = TRUE, value.offset = .3)
#plot_model(flower_abundance_mod4_nb , type = "est", show.values = TRUE, value.offset = .3)
#plot_model(flower_abundance_mod5_nb , type = "est", show.values = TRUE, value.offset = .3)
## wind flower_abundance_mod5_nb---------
# Get the original mean and SD of wind velocity before scaling
wind_mean <- mean(envir_data$dm_wind_velocity, na.rm = TRUE)
wind_sd <- sd(envir_data$dm_wind_velocity, na.rm = TRUE)
# Get predictions on the scaled variable
pred_wind <- ggpredict(flower_abundance_mod5_nb , terms = "dm_wind_velocity")
# Unscale the x-axis
pred_wind$x_unscaled <- (pred_wind$x * wind_sd) + wind_mean
# Plot
(wind_count_flowercam <-ggplot(pred_wind, aes(x = x_unscaled, y = predicted)) +
  #plot with predictor color
  geom_line(linewidth = 1.2, color = predictor_colors[["dm_wind_velocity"]]) +
  geom_ribbon(aes(ymin = conf.low, ymax = conf.high), 
              fill = alpha(predictor_colors[["dm_wind_velocity"]], 0.5)) +
  labs(
    title = "Flower camera: Predicted Insect Abundance by Wind Velocity",
    x = "Wind Velocity (km/h)",
    y = "Predicted Insect Abundance"
  ))
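
Back-transforming a scaled predictor only recovers the original units if the mean and SD are the ones actually used when scaling. As a hedged sketch with made-up wind velocities (the vector below is hypothetical), `scale()` stores both values as attributes, which can be reused instead of recomputing them from another table:

```r
# Hypothetical raw wind velocities (km/h), for illustration only
wind_raw <- c(4.2, 7.5, 10.1, 12.8, 15.3, 9.6)

wind_scaled <- scale(wind_raw)   # centred and divided by the SD
wind_center <- attr(wind_scaled, "scaled:center")
wind_scale  <- attr(wind_scaled, "scaled:scale")

# Back-transform with the stored centre and scale
wind_back <- as.numeric(wind_scaled) * wind_scale + wind_center
all.equal(wind_back, wind_raw)

# Mean/SD taken from a different set of rows do not recover the originals
other_rows <- wind_raw[1:4]
wind_wrong <- as.numeric(wind_scaled) * sd(other_rows) + mean(other_rows)
isTRUE(all.equal(wind_wrong, wind_raw))
```

If the model data were scaled after joining tables, it may be worth confirming that `envir_data` contains the same rows as the data passed to the model before using its mean and SD for the axis.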

## average_flower_cover flower_abundance_mod5_nb---------
#plot the predictions for average_flower_cover with the x-axis back-transformed to the original scale
# Get the original mean and SD of average flower cover before scaling
flower_cover_mean <- mean(planty$average_flower_cover, na.rm = TRUE)
flower_cover_sd <- sd(planty$average_flower_cover, na.rm = TRUE)
# Get predictions on the scaled variable
pred_avg_flower_cover <- ggpredict(flower_abundance_mod5_nb , terms = "average_flower_cover")
# Unscale the x-axis
pred_avg_flower_cover$x_unscaled <- (pred_avg_flower_cover$x * flower_cover_sd) + flower_cover_mean

# Plot
(avg_flowcam <- ggplot(pred_avg_flower_cover, aes(x = x_unscaled, y = predicted)) +
  #plot with predictor color
  geom_line(linewidth = 1.2, color = predictor_colors[["average_flower_cover"]]) +
  geom_ribbon(aes(ymin = conf.low, ymax = conf.high), 
              fill = alpha(predictor_colors[["average_flower_cover"]], 0.5)) +
  labs(
    title = "Flower camera: Predicted Insect Activity by Average Flower Cover",
    x = "Average Flower Cover (%)",
    y = "Predicted Insect Abundance"
  ))

#plot all wind_count_net, wind_count_flowercam and wind_count_platform in one plot
library(cowplot)

wind_count_net1 <- wind_count_net + theme(plot.title = element_text(size = 16))+ labs(title = "Transect walk: Predicted Interaction \nCounts by Wind Velocity") 
wind_count_flowercam1 <- wind_count_flowercam + theme(plot.title = element_text(size = 16)) +labs(title = "Flower cameras: Predicted Insect \nAbundance by Wind Velocity") 
wind_count_platform1 <- wind_count_platform + theme(plot.title = element_text(size = 16))+ labs(title = "Platform cameras: Predicted Insect \nCount by Wind Velocity") 

(cowplot::plot_grid(wind_count_net1, wind_count_flowercam1, wind_count_platform1, ncol = 3, labels = c("A", "B", "C")) +
  theme(plot.title = element_text(hjust = 0.5))+
  # font size of titles
  theme(plot.title = element_text(size = 16)))

#save the plot in figures
ggsave("C:/Users/Almas/Desktop/UNI_LEIPSI/Thesis/Thesis_Rproject/figures/wind_velocity.png", 
       width = 14, height = 6, dpi = 600)

IV.D.1.c. Interpretation of the model results

#remove all objects starting with flower_abundance_
rm(list = ls(pattern = "^flower_abundance_"))

IV.D.2. FLOWER CAMERA FULL Richness - NB

#count unique insect families per site, date, camera and flower species
flower_richness <- flower_camera_modelling %>%
  group_by(site, date , cam, flower_sp) %>%
  summarise(family_count = n_distinct(Family), .groups = 'drop') %>%
  #join the plant survey data
  left_join(planty, by = c("site"= "Site"))%>%
  #join environmental data
  left_join(envir_data, by = c("site"= "Site","date"= "Date", "average_flower_cover"))%>%
  #remove columns that are not used in the richness models
  dplyr::select(-c(Plot_Cover_T, Transect, minutes_since_9am, majority_class, grass, snh, forest, urban, agri, water, Pastinaca.sativa, Daucus.carota, Floral_simpson_index_T, top2_ratio))%>%
  #scale the environmental data
  mutate(across(c(dm_wind_velocity, dm_temperature, average_flower_cover, Days_since_start, Floral_simpson_index_site), scale))%>%
  distinct()
## Warning in left_join(., envir_data, by = c(site = "Site", date = "Date", : Detected an unexpected many-to-many relationship between `x` and `y`.
## ℹ Row 1 of `x` matches multiple rows in `y`.
## ℹ Row 41 of `y` matches multiple rows in `x`.
## ℹ If a many-to-many relationship is expected, set `relationship =
##   "many-to-many"` to silence this warning.
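
The many-to-many warning means that the join keys repeat on both sides, so matching rows are multiplied. The base R sketch below reproduces the effect with two hypothetical miniature tables and shows one possible remedy, collapsing the right-hand table to one row per key before joining. Whether that is appropriate here depends on the structure of `envir_data`, which is not shown; the pipeline above instead relies on `distinct()` after the join.

```r
# Hypothetical miniature versions of the two tables being joined
cams <- data.frame(site = c("A", "A", "B"),
                   cam  = c("c1", "c2", "c1"),
                   family_count = c(5, 8, 3))
envir <- data.frame(site = c("A", "A", "B"),
                    transect = c("T1", "T2", "T1"),
                    dm_wind_velocity = c(10, 12, 7))

# The key repeats in both tables, so the join is many-to-many
anyDuplicated(cams$site) > 0
anyDuplicated(envir$site) > 0

joined <- merge(cams, envir, by = "site")
nrow(joined)   # larger than nrow(cams): each site-A camera matches two transects

# One row per key on the right-hand side keeps the row count unchanged
envir_site <- aggregate(dm_wind_velocity ~ site, data = envir, FUN = mean)
joined_ok  <- merge(cams, envir_site, by = "site")
nrow(joined_ok)
```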
#histogram of the number of families per flower camera
flower_richness %>%
  ggplot(aes(x = family_count)) +
  geom_histogram(binwidth=1, fill = "lightblue", color = "black") +
  labs(title = "Histogram of Number of Families per Flower Camera",
       x = "Number of Families",
       y = "Count")

#testing normality of the number of families per flower camera
shapiro.test(flower_richness$family_count) # p-value = 0.0007664, number of families per flower camera is not normally distributed
## 
##  Shapiro-Wilk normality test
## 
## data:  flower_richness$family_count
## W = 0.88599, p-value = 0.0007664
datawizard::describe_distribution(flower_richness$family_count)
##  Mean |    SD | IQR |         Range | Skewness | Kurtosis |  n | n_Missing
## --------------------------------------------------------------------------
## 19.23 | 15.20 |  24 | [2.00, 57.00] |     0.90 |    -0.11 | 40 |         0
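
The summary statistics above already indicate why a negative binomial model is used in the next step: for a Poisson variable the variance equals the mean, whereas here it is far larger. A quick check with the printed mean and SD follows. The method-of-moments value for theta is a rough marginal guess, not the conditional estimate that the model reports.

```r
# Summary statistics copied from describe_distribution() above
mean_fam <- 19.23
sd_fam   <- 15.20

var_fam     <- sd_fam^2
var_to_mean <- var_fam / mean_fam   # would be about 1 for Poisson counts

# Method-of-moments guess for the NB2 dispersion: var = mu + mu^2 / theta
theta_mom <- mean_fam^2 / (var_fam - mean_fam)

round(c(variance = var_fam, var_to_mean = var_to_mean, theta_mom = theta_mom), 2)
```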

negative binomial distribution

# full model with family richness (family_count) as response variable, weather, sampling date and floral variables as explanatory variables, and site as random effect
flower_richness_mod1_nb <- glmmTMB(family_count 
                                    ~ average_flower_cover 
                                    + Floral_simpson_index_site 
                                    + Days_since_start  
                                    + dm_wind_velocity  
                                    + dm_temperature  
                                    + (1 | site),
                                    family = nbinom2,
                                    data = flower_richness)

summary(flower_richness_mod1_nb)
##  Family: nbinom2  ( log )
## Formula:          
## family_count ~ average_flower_cover + Floral_simpson_index_site +  
##     Days_since_start + dm_wind_velocity + dm_temperature + (1 |      site)
## Data: flower_richness
## 
##      AIC      BIC   logLik deviance df.resid 
##    319.5    333.0   -151.8    303.5       32 
## 
## Random effects:
## 
## Conditional model:
##  Groups Name        Variance  Std.Dev. 
##  site   (Intercept) 1.779e-09 4.218e-05
## Number of obs: 40, groups:  site, 9
## 
## Dispersion parameter for nbinom2 family (): 2.21 
## 
## Conditional model:
##                           Estimate Std. Error z value Pr(>|z|)    
## (Intercept)                 2.8859     0.1133  25.477   <2e-16 ***
## average_flower_cover       -0.5103     0.2399  -2.127   0.0334 *  
## Floral_simpson_index_site  -0.3774     0.2425  -1.556   0.1196    
## Days_since_start            0.2017     0.1344   1.500   0.1336    
## dm_wind_velocity           -0.3063     0.1772  -1.729   0.0838 .  
## dm_temperature             -0.1590     0.1760  -0.903   0.3665    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
parameters(flower_richness_mod1_nb)
## Your model may suffer from singularity (see `?lme4::isSingular` and
##   `?performance::check_singularity`).
##   Some of the confidence intervals of the random effects parameters are
##   probably not meaningful!
##   You may try to impose a prior on the random effects parameters, e.g.
##   using the glmmTMB package.
## # Fixed Effects
## 
## Parameter                 | Log-Mean |   SE |         95% CI |     z |      p
## -----------------------------------------------------------------------------
## (Intercept)               |     2.89 | 0.11 | [ 2.66,  3.11] | 25.48 | < .001
## average flower cover      |    -0.51 | 0.24 | [-0.98, -0.04] | -2.13 | 0.033 
## Floral simpson index site |    -0.38 | 0.24 | [-0.85,  0.10] | -1.56 | 0.120 
## Days since start          |     0.20 | 0.13 | [-0.06,  0.47] |  1.50 | 0.134 
## dm wind velocity          |    -0.31 | 0.18 | [-0.65,  0.04] | -1.73 | 0.084 
## dm temperature            |    -0.16 | 0.18 | [-0.50,  0.19] | -0.90 | 0.367 
## 
## # Dispersion
## 
## Parameter   | Coefficient |       95% CI
## ----------------------------------------
## (Intercept) |        2.21 | [1.37, 3.55]
## 
## # Random Effects Variances
## 
## Parameter            | Coefficient |      95% CI
## ------------------------------------------------
## SD (Intercept: site) |    4.22e-05 | [0.00, Inf]
## 
## Uncertainty intervals (equal-tailed) and p-values (two-tailed) computed
##   using a Wald z-distribution approximation.
#check for singularity
performance::check_singularity(flower_richness_mod1_nb)
## [1] TRUE
#check the model
check_model(flower_richness_mod1_nb, verbose = TRUE)
## Homogeneity of variance could not be computed. Cannot extract residual
##   variance from objects of class 'glmmTMB'.
## `check_outliers()` does not yet support models of class `glmmTMB`.

#overdispersion
check_overdispersion(flower_richness_mod1_nb)
## # Overdispersion test
## 
##  dispersion ratio = 0.909
##           p-value = 0.976
## No overdispersion detected.
#collinearity
check_collinearity(flower_richness_mod1_nb)
## # Check for Multicollinearity
## 
## Low Correlation
## 
##                       Term  VIF   VIF 95% CI Increased SE Tolerance Tolerance 95% CI
##       average_flower_cover 4.12 [2.79, 6.43]         2.03      0.24     [0.16, 0.36]
##  Floral_simpson_index_site 4.25 [2.87, 6.64]         2.06      0.24     [0.15, 0.35]
##           Days_since_start 1.53 [1.21, 2.34]         1.24      0.65     [0.43, 0.83]
##           dm_wind_velocity 2.39 [1.73, 3.67]         1.55      0.42     [0.27, 0.58]
##             dm_temperature 2.20 [1.61, 3.37]         1.48      0.45     [0.30, 0.62]
# DHARMa package - simulate residuals and check model assumptions
flower_richness_mod1_nb_sim_res <- simulateResiduals(fittedModel = flower_richness_mod1_nb)
plot(flower_richness_mod1_nb_sim_res)

#remove days since start (p = 0.134 for flower_richness_mod1_nb)
flower_richness_mod2_nb <- glmmTMB(family_count 
                                    ~ average_flower_cover 
                                    + Floral_simpson_index_site 
                                    #+ Days_since_start  
                                    + dm_wind_velocity  
                                    + dm_temperature  
                                    + (1 | site),
                                    family = nbinom2,
                                    data = flower_richness)
summary(flower_richness_mod2_nb)
##  Family: nbinom2  ( log )
## Formula:          
## family_count ~ average_flower_cover + Floral_simpson_index_site +  
##     dm_wind_velocity + dm_temperature + (1 | site)
## Data: flower_richness
## 
##      AIC      BIC   logLik deviance df.resid 
##    319.6    331.4   -152.8    305.6       33 
## 
## Random effects:
## 
## Conditional model:
##  Groups Name        Variance Std.Dev.
##  site   (Intercept) 0.02121  0.1456  
## Number of obs: 40, groups:  site, 9
## 
## Dispersion parameter for nbinom2 family (): 2.18 
## 
## Conditional model:
##                           Estimate Std. Error z value Pr(>|z|)    
## (Intercept)                2.89067    0.12595  22.951   <2e-16 ***
## average_flower_cover      -0.57411    0.24992  -2.297   0.0216 *  
## Floral_simpson_index_site -0.41215    0.25340  -1.627   0.1038    
## dm_wind_velocity          -0.15800    0.16402  -0.963   0.3354    
## dm_temperature            -0.07971    0.18962  -0.420   0.6742    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
parameters(flower_richness_mod2_nb)
## # Fixed Effects
## 
## Parameter                 | Log-Mean |   SE |         95% CI |     z |      p
## -----------------------------------------------------------------------------
## (Intercept)               |     2.89 | 0.13 | [ 2.64,  3.14] | 22.95 | < .001
## average flower cover      |    -0.57 | 0.25 | [-1.06, -0.08] | -2.30 | 0.022 
## Floral simpson index site |    -0.41 | 0.25 | [-0.91,  0.08] | -1.63 | 0.104 
## dm wind velocity          |    -0.16 | 0.16 | [-0.48,  0.16] | -0.96 | 0.335 
## dm temperature            |    -0.08 | 0.19 | [-0.45,  0.29] | -0.42 | 0.674 
## 
## # Dispersion
## 
## Parameter   | Coefficient |       95% CI
## ----------------------------------------
## (Intercept) |        2.18 | [1.29, 3.66]
## 
## # Random Effects Variances
## 
## Parameter            | Coefficient |       95% CI
## -------------------------------------------------
## SD (Intercept: site) |        0.15 | [0.01, 2.38]
## 
## Uncertainty intervals (equal-tailed) and p-values (two-tailed) computed
##   using a Wald z-distribution approximation.
#check the model
check_model(flower_richness_mod2_nb, verbose = TRUE)
## `check_outliers()` does not yet support models of class `glmmTMB`.

#overdispersion
check_overdispersion(flower_richness_mod2_nb)
## # Overdispersion test
## 
##  dispersion ratio = 0.955
##           p-value =  0.84
## No overdispersion detected.
#collinearity
check_collinearity(flower_richness_mod2_nb)
## # Check for Multicollinearity
## 
## Low Correlation
## 
##                       Term  VIF   VIF 95% CI Increased SE Tolerance Tolerance 95% CI
##       average_flower_cover 3.62 [2.45, 5.71]         1.90      0.28     [0.18, 0.41]
##  Floral_simpson_index_site 3.75 [2.53, 5.92]         1.94      0.27     [0.17, 0.39]
##           dm_wind_velocity 1.69 [1.29, 2.61]         1.30      0.59     [0.38, 0.77]
##             dm_temperature 2.03 [1.50, 3.15]         1.43      0.49     [0.32, 0.67]
# DHARMa package - simulate residuals and check model assumptions
flower_richness_mod2_nb_sim_res <- simulateResiduals(fittedModel = flower_richness_mod2_nb)
plot(flower_richness_mod2_nb_sim_res)

#remove dm temperature (p = 0.6742 for flower_richness_mod2_nb)
flower_richness_mod3_nb <- glmmTMB(family_count 
                                    ~ average_flower_cover 
                                    + Floral_simpson_index_site 
                                    #+ Days_since_start  
                                    + dm_wind_velocity  
                                    #+ dm_temperature  
                                    + (1 | site),
                                    family = nbinom2,
                                    data = flower_richness)

summary(flower_richness_mod3_nb)
##  Family: nbinom2  ( log )
## Formula:          
## family_count ~ average_flower_cover + Floral_simpson_index_site +  
##     dm_wind_velocity + (1 | site)
## Data: flower_richness
## 
##      AIC      BIC   logLik deviance df.resid 
##    317.8    327.9   -152.9    305.8       34 
## 
## Random effects:
## 
## Conditional model:
##  Groups Name        Variance Std.Dev.
##  site   (Intercept) 0.02405  0.1551  
## Number of obs: 40, groups:  site, 9
## 
## Dispersion parameter for nbinom2 family (): 2.18 
## 
## Conditional model:
##                           Estimate Std. Error z value Pr(>|z|)    
## (Intercept)                 2.8906     0.1273  22.704   <2e-16 ***
## average_flower_cover       -0.5988     0.2455  -2.438   0.0147 *  
## Floral_simpson_index_site  -0.3901     0.2516  -1.550   0.1211    
## dm_wind_velocity           -0.1165     0.1343  -0.868   0.3855    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
parameters(flower_richness_mod3_nb)
## # Fixed Effects
## 
## Parameter                 | Log-Mean |   SE |         95% CI |     z |      p
## -----------------------------------------------------------------------------
## (Intercept)               |     2.89 | 0.13 | [ 2.64,  3.14] | 22.70 | < .001
## average flower cover      |    -0.60 | 0.25 | [-1.08, -0.12] | -2.44 | 0.015 
## Floral simpson index site |    -0.39 | 0.25 | [-0.88,  0.10] | -1.55 | 0.121 
## dm wind velocity          |    -0.12 | 0.13 | [-0.38,  0.15] | -0.87 | 0.386 
## 
## # Dispersion
## 
## Parameter   | Coefficient |       95% CI
## ----------------------------------------
## (Intercept) |        2.18 | [1.29, 3.66]
## 
## # Random Effects Variances
## 
## Parameter            | Coefficient |       95% CI
## -------------------------------------------------
## SD (Intercept: site) |        0.16 | [0.01, 1.88]
## 
## Uncertainty intervals (equal-tailed) and p-values (two-tailed) computed
##   using a Wald z-distribution approximation.
#check the model
check_model(flower_richness_mod3_nb, verbose = TRUE)
## `check_outliers()` does not yet support models of class `glmmTMB`.

#overdispersion
check_overdispersion(flower_richness_mod3_nb)
## # Overdispersion test
## 
##  dispersion ratio = 0.978
##           p-value =  0.88
## No overdispersion detected.
#collinearity
check_collinearity(flower_richness_mod3_nb)
## # Check for Multicollinearity
## 
## Low Correlation
## 
##                       Term  VIF   VIF 95% CI Increased SE Tolerance Tolerance 95% CI
##       average_flower_cover 3.42 [2.31, 5.47]         1.85      0.29     [0.18, 0.43]
##  Floral_simpson_index_site 3.59 [2.41, 5.74]         1.89      0.28     [0.17, 0.41]
##           dm_wind_velocity 1.11 [1.01, 2.91]         1.05      0.90     [0.34, 0.99]
# DHARMa package - simulate residuals and check model assumptions
flower_richness_mod3_nb_sim_res <- simulateResiduals(fittedModel = flower_richness_mod3_nb)
plot(flower_richness_mod3_nb_sim_res)

#remove wind velocity (p = 0.386 for flower_richness_mod3_nb)
flower_richness_mod4_nb <- glmmTMB(family_count 
                                    ~ average_flower_cover 
                                    + Floral_simpson_index_site 
                                    #+ Days_since_start  
                                    #+ dm_wind_velocity  
                                    #+ dm_temperature  
                                    + (1 | site),
                                    family = nbinom2,
                                    data = flower_richness)
summary(flower_richness_mod4_nb)
##  Family: nbinom2  ( log )
## Formula:          
## family_count ~ average_flower_cover + Floral_simpson_index_site +  
##     (1 | site)
## Data: flower_richness
## 
##      AIC      BIC   logLik deviance df.resid 
##    316.4    324.9   -153.2    306.4       35 
## 
## Random effects:
## 
## Conditional model:
##  Groups Name        Variance Std.Dev.
##  site   (Intercept) 0.03786  0.1946  
## Number of obs: 40, groups:  site, 9
## 
## Dispersion parameter for nbinom2 family (): 2.18 
## 
## Conditional model:
##                           Estimate Std. Error z value Pr(>|z|)    
## (Intercept)                 2.8912     0.1328  21.776   <2e-16 ***
## average_flower_cover       -0.5547     0.2452  -2.262   0.0237 *  
## Floral_simpson_index_site  -0.3271     0.2452  -1.334   0.1822    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
parameters(flower_richness_mod4_nb)
## # Fixed Effects
## 
## Parameter                 | Log-Mean |   SE |         95% CI |     z |      p
## -----------------------------------------------------------------------------
## (Intercept)               |     2.89 | 0.13 | [ 2.63,  3.15] | 21.78 | < .001
## average flower cover      |    -0.55 | 0.25 | [-1.04, -0.07] | -2.26 | 0.024 
## Floral simpson index site |    -0.33 | 0.25 | [-0.81,  0.15] | -1.33 | 0.182 
## 
## # Dispersion
## 
## Parameter   | Coefficient |       95% CI
## ----------------------------------------
## (Intercept) |        2.18 | [1.30, 3.67]
## 
## # Random Effects Variances
## 
## Parameter            | Coefficient |       95% CI
## -------------------------------------------------
## SD (Intercept: site) |        0.19 | [0.04, 1.02]
## 
## Uncertainty intervals (equal-tailed) and p-values (two-tailed) computed
##   using a Wald z-distribution approximation.
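
Because the model uses a log link, the estimates above are on the log scale and are easier to read after exponentiation. The sketch below back-transforms the printed values. It assumes the predictors were z-standardised before fitting, so the intercept refers to a site with average predictor values and each slope to a change of one standard deviation.

```r
# Estimates and Wald limits on the log scale, copied from the parameters() table above
log_est <- c(intercept = 2.89, average_flower_cover = -0.55)
ci_low  <- c(intercept = 2.63, average_flower_cover = -1.04)
ci_high <- c(intercept = 3.15, average_flower_cover = -0.07)

# Back-transform to expected counts (intercept) and rate ratios (slopes)
response_scale <- exp(cbind(estimate = log_est, ci_low, ci_high))
round(response_scale, 2)
```

Under these assumptions the expected family count at average conditions is about 18, and one standard deviation more flower cover multiplies it by roughly 0.58.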
#check the model
check_model(flower_richness_mod4_nb, verbose = T)
## `check_outliers()` does not yet support models of class `glmmTMB`.

#overdispersion
check_overdispersion(flower_richness_mod4_nb)
## # Overdispersion test
## 
##  dispersion ratio = 0.932
##           p-value = 0.968
## No overdispersion detected.
#collinearity
check_collinearity(flower_richness_mod4_nb)
## # Check for Multicollinearity
## 
## Low Correlation
## 
##                       Term  VIF   VIF 95% CI Increased SE Tolerance
##       average_flower_cover 3.20 [2.16, 5.18]         1.79      0.31
##  Floral_simpson_index_site 3.20 [2.16, 5.18]         1.79      0.31
##  Tolerance 95% CI
##      [0.19, 0.46]
##      [0.19, 0.46]
# DHARMa package - simulate residuals and check model assumptions
flower_richness_mod4_nb_sim_res <- simulateResiduals(fittedModel = flower_richness_mod4_nb)
plot(flower_richness_mod4_nb_sim_res)

#remove floral simpson index (p = 0.182 for flower_richness_mod4_nb)
flower_richness_mod5_nb <- glmmTMB(family_count 
                                    ~ average_flower_cover 
                                    #+ Floral_simpson_index_site 
                                    #+ Days_since_start  
                                    #+ dm_wind_velocity  
                                    #+ dm_temperature  
                                    + (1 | site),
                                    family = nbinom2,
                                    data = flower_richness)
summary(flower_richness_mod5_nb)
##  Family: nbinom2  ( log )
## Formula:          family_count ~ average_flower_cover + (1 | site)
## Data: flower_richness
## 
##      AIC      BIC   logLik deviance df.resid 
##    316.1    322.9   -154.1    308.1       36 
## 
## Random effects:
## 
## Conditional model:
##  Groups Name        Variance Std.Dev.
##  site   (Intercept) 0.04925  0.2219  
## Number of obs: 40, groups:  site, 9
## 
## Dispersion parameter for nbinom2 family (): 2.13 
## 
## Conditional model:
##                      Estimate Std. Error z value Pr(>|z|)    
## (Intercept)            2.8975     0.1394  20.779   <2e-16 ***
## average_flower_cover  -0.2837     0.1427  -1.989   0.0467 *  
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
parameters(flower_richness_mod5_nb)
## # Fixed Effects
## 
## Parameter            | Log-Mean |   SE |         95% CI |     z |      p
## ------------------------------------------------------------------------
## (Intercept)          |     2.90 | 0.14 | [ 2.62,  3.17] | 20.78 | < .001
## average flower cover |    -0.28 | 0.14 | [-0.56,  0.00] | -1.99 | 0.047 
## 
## # Dispersion
## 
## Parameter   | Coefficient |       95% CI
## ----------------------------------------
## (Intercept) |        2.13 | [1.26, 3.59]
## 
## # Random Effects Variances
## 
## Parameter            | Coefficient |       95% CI
## -------------------------------------------------
## SD (Intercept: site) |        0.22 | [0.05, 1.02]
## 
## Uncertainty intervals (equal-tailed) and p-values (two-tailed) computed
##   using a Wald z-distribution approximation.
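
The Wald statistics in the table follow directly from the estimate and its standard error. The sketch below recomputes them for `average_flower_cover` from the values in `summary(flower_richness_mod5_nb)`. Only base-R functions are used, and the inputs are the rounded values printed above.

```r
est <- -0.2837  # average_flower_cover estimate (log scale)
se  <-  0.1427  # its standard error

z  <- est / se                            # Wald z statistic
p  <- 2 * pnorm(-abs(z))                  # two-tailed p-value
ci <- est + c(-1, 1) * qnorm(0.975) * se  # 95% Wald confidence interval

round(c(z = z, p = p, lower = ci[1], upper = ci[2]), 3)
```

The upper limit is only just below zero (it is shown as 0.00 in the table because of rounding), so the evidence for a flower-cover effect in this model is marginal.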
#check the model
check_model(flower_richness_mod5_nb, verbose = T)
## Not enough model terms in the conditional part of the model to check for
##   multicollinearity.
## `check_outliers()` does not yet support models of class `glmmTMB`.

#overdispersion
check_overdispersion(flower_richness_mod5_nb)
## # Overdispersion test
## 
##  dispersion ratio = 0.930
##           p-value = 0.888
## No overdispersion detected.
#collinearity
check_collinearity(flower_richness_mod5_nb)
## Not enough model terms in the conditional part of the model to check for
##   multicollinearity.
## NULL
# DHARMa package - simulate residuals and check model assumptions
flower_richness_mod5_nb_sim_res <- simulateResiduals(fittedModel = flower_richness_mod5_nb)
plot(flower_richness_mod5_nb_sim_res)
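
As a complement to dropping terms by their Wald p-values, nested models can be compared with a likelihood-ratio test. The sketch below does this by hand for `flower_richness_mod4_nb` against `flower_richness_mod5_nb`, using the rounded log-likelihoods and residual degrees of freedom printed in the two summaries. Calling `anova()` on the two fitted models should give the same test from the unrounded values.

```r
loglik_full    <- -153.2   # flower_richness_mod4_nb (with Floral_simpson_index_site)
loglik_reduced <- -154.1   # flower_richness_mod5_nb (without it)
df_diff        <- 36 - 35  # difference in residual degrees of freedom

lr_stat <- 2 * (loglik_full - loglik_reduced)
p_value <- pchisq(lr_stat, df = df_diff, lower.tail = FALSE)

round(c(lr_stat = lr_stat, p_value = p_value), 3)
```

The resulting p-value of about 0.18 is in line with the Wald p-value of 0.182 for the floral Simpson index, so both approaches support removing the term.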

report(flower_richness_mod5_nb)
## Warning in text == "" | text2 == "": longer object length is not a multiple of
## shorter object length
## Warning in text == "" | text2 == "": longer object length is not a multiple of
## shorter object length
## We fitted a negative-binomial mixed model (estimated using ML and nlminb
## optimizer) to predict family_count with average_flower_cover (formula:
## family_count ~ average_flower_cover). The model included site as random effect
## (formula: ~1 | site). The model's total explanatory power is moderate
## (conditional R2 = 0.24) and the part related to the fixed effects alone
## (marginal R2) is of 0.15. The model's intercept, corresponding to
## average_flower_cover = 0, is at 2.90 (95% CI [2.62, 3.17], p < .001). Within
## this model:
## 
##   - The intercept is statistically significant and negative (beta = -0.28, 95% CI
## [-0.56, -4.10e-03], p = 0.047; Std. beta = 2.90, 95% CI [2.62, 3.17])
## 
## Standardized parameters were obtained by fitting the model on a standardized
## version of the dataset. 95% Confidence Intervals (CIs) and p-values were
## computed using a Wald z-distribution approximation. and We fitted a
## negative-binomial mixed model (estimated using ML and nlminb optimizer) to
## predict family_count with average_flower_cover (formula: family_count ~
## average_flower_cover). The model included site as random effect (formula: ~1 |
## site). The model's total explanatory power is moderate (conditional R2 = 0.24)
## and the part related to the fixed effects alone (marginal R2) is of 0.15. The
## model's intercept, corresponding to average_flower_cover = 0, is at 2.13 (95%
## CI [1.26, 3.59]). Within this model:
## 
##   - The intercept is statistically significant and negative (beta = -0.28, 95% CI
## [-0.56, -4.10e-03], p = 0.047; Std. beta = 2.90, 95% CI [2.62, 3.17])
## 
## Standardized parameters were obtained by fitting the model on a standardized
## version of the dataset. 95% Confidence Intervals (CIs) and p-values were
## computed using a Wald z-distribution approximation.

IV.D.2.a. Compare the models with the performance package

# Compare the models with the performance package
flower_richness_nb_comp1 <- compare_performance(flower_richness_mod1_nb, flower_richness_mod2_nb, flower_richness_mod3_nb, flower_richness_mod4_nb, flower_richness_mod5_nb,
                                                 metrics = c("AICc", "BIC", "R2", "ICC", "RMSE"))
# Print the comparison table
print(flower_richness_nb_comp1)
## # Comparison of Model Performance Indices
## 
## Name                    |   Model | AICc (weights) | BIC (weights) |   RMSE
## ---------------------------------------------------------------------------
## flower_richness_mod1_nb | glmmTMB |  324.2 (0.016) | 333.0 (0.004) | 13.844
## flower_richness_mod2_nb | glmmTMB |  323.1 (0.028) | 331.4 (0.010) | 13.315
## flower_richness_mod3_nb | glmmTMB |  320.3 (0.114) | 327.9 (0.056) | 13.351
## flower_richness_mod4_nb | glmmTMB |  318.2 (0.323) | 324.9 (0.249) | 13.106
## flower_richness_mod5_nb | glmmTMB |  317.3 (0.519) | 322.9 (0.682) | 12.934
## 
## Name                    | R2 (cond.) | R2 (marg.) |   ICC
## ---------------------------------------------------------
## flower_richness_mod1_nb |            |            |      
## flower_richness_mod2_nb |      0.273 |      0.236 | 0.049
## flower_richness_mod3_nb |      0.275 |      0.233 | 0.055
## flower_richness_mod4_nb |      0.279 |      0.212 | 0.084
## flower_richness_mod5_nb |      0.238 |      0.149 | 0.105
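
The AICc values and weights in the comparison table can be checked with a few lines of arithmetic. The sketch below applies the small-sample correction to the AIC of the last two models and recomputes the Akaike weights from the AICc column. The parameter counts are inferred from the residual degrees of freedom in the summaries (40 observations minus `df.resid`).

```r
# AICc = AIC + 2k(k + 1) / (n - k - 1), with k estimated parameters and n observations
aicc <- function(aic, k, n) aic + 2 * k * (k + 1) / (n - k - 1)

n <- 40
aicc_mod4 <- aicc(316.4, k = 5, n = n)  # intercept, 2 slopes, site variance, dispersion
aicc_mod5 <- aicc(316.1, k = 4, n = n)  # intercept, 1 slope, site variance, dispersion

# Akaike weights from the AICc column of the comparison table
aicc_all <- c(mod1 = 324.2, mod2 = 323.1, mod3 = 320.3, mod4 = 318.2, mod5 = 317.3)
delta    <- aicc_all - min(aicc_all)
weights  <- exp(-delta / 2) / sum(exp(-delta / 2))

round(c(aicc_mod4 = aicc_mod4, aicc_mod5 = aicc_mod5), 1)
round(weights, 3)
```

Both results agree with the table up to rounding of the inputs. The AICc difference between `mod4` and `mod5` is below 2, so the two models are close to equivalent by this criterion even though `mod5` carries the largest weight.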

IV.D.2.b. Visualize the model results

plot_model(flower_richness_mod1_nb, type = "est", show.values = TRUE, value.offset = .3)

plot_model(flower_richness_mod2_nb, type = "est", show.values = TRUE, value.offset = .3)

plot_model(flower_richness_mod3_nb, type = "est", show.values = TRUE, value.offset = .3)

plot_model(flower_richness_mod4_nb, type = "est", show.values = TRUE, value.offset = .3)

plot_model(flower_richness_mod5_nb, type = "est", show.values = TRUE, value.offset = .3)

plot_model(flower_richness_mod1_nb, 
           type = "est", 
           show.values = TRUE, 
           value.offset = 0.3,
           #sort.est = TRUE,
           axis.labels = c(
             "Temperature",
             "Wind Velocity (km/h)",
             "Days since start",
             "Floral Simpson Index",
             "Average Flower Cover %" )) +
    labs(title = "Flower Camera: Richness of Pollinators", x = "Predictors",y = "Estimate") + 
    theme(axis.text.y = element_text(hjust = 0))  # 0 = left, 1 = right

# average flower cover: flower_richness_mod5_nb ---------
# get the original mean and SD of average flower cover before scaling
average_flower_cover_mean <- mean(planty$average_flower_cover, na.rm = TRUE)
average_flower_cover_sd <- sd(planty$average_flower_cover, na.rm = TRUE)
# Get predictions on the scaled variable
pred_avg_flower_cover <- ggpredict(flower_richness_mod5_nb , terms = "average_flower_cover")
# Unscale the x-axis
pred_avg_flower_cover$x_unscaled <- (pred_avg_flower_cover$x * average_flower_cover_sd) + average_flower_cover_mean
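
# Note: the back-transformation above is only exact if the mean and SD are the
# ones that were actually used when the predictor was scaled. The sketch below
# shows, on made-up values (not the study data), how the centre and scale can be
# read from the attributes that scale() attaches to its result.

```r
# Hypothetical raw values standing in for a cover variable
raw <- c(5, 12, 20, 33, 47, 60)

scaled <- scale(raw)                     # one-column matrix with scaling attributes
centre <- attr(scaled, "scaled:center")  # mean used for centring
spread <- attr(scaled, "scaled:scale")   # standard deviation used for scaling

# Back-transform: x_raw = x_scaled * sd + mean
recovered <- as.numeric(scaled) * spread + centre
all.equal(recovered, raw)
```

# If the predictor was scaled on a joined table with repeated site rows, its mean
# and SD can differ from those of the plant survey table, so it may be worth
# comparing the two before relabelling the axis.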

# Plot
(avg_flowcam_rich <- ggplot(pred_avg_flower_cover, aes(x = x_unscaled, y = predicted)) +
  #plot with predictor color
  geom_line(linewidth = 1.2, color = predictor_colors[["average_flower_cover"]]) +
  geom_ribbon(aes(ymin = conf.low, ymax = conf.high), 
              fill = alpha(predictor_colors[["average_flower_cover"]], 0.5)) +
  labs(
    title = "Flower camera: Predicted Insect Richness by Average Flower Cover",
    x = "Average Flower Cover (%)",
    y = "Predicted Insect Richness"
  ))

#plot avg_flowcam and avg_flowcam_rich together (est_flowcam is currently commented out)
library(cowplot)
#est_flowcam1 <- est_flowcam + theme(plot.title = element_text(size = 16))+ labs(title = "Predicted Insect Activity")
avg_flowcam1 <- avg_flowcam + theme(plot.title = element_text(size = 14))+ labs(title = "Predicted Insect Activity by Average Flower Cover")
avg_flowcam_rich1 <- avg_flowcam_rich + theme(plot.title = element_text(size = 14))+ labs(title = "Predicted Insect Richness by Average Flower Cover")

#combine plots
(combi <- cowplot::plot_grid(
  #est_flowcam1, 
  avg_flowcam1, 
  avg_flowcam_rich1, 
  ncol = 2, labels = c("A", "B"), 
  #equal width for both panels
  rel_widths = c(1, 1)))

#add main title to combi
(final_plot <- cowplot::plot_grid(
  ggdraw() + draw_label(
    "Flower camera: Predicted Insect Activity and Richness", 
    fontface = 'bold', size = 16, x = 0.5, hjust = 0.5
  ),
  combi,
  ncol = 1,
  rel_heights = c(0.1, 1)  # Title height vs. plots height
))

#save the plot in figures
ggsave("C:/Users/Almas/Desktop/UNI_LEIPSI/Thesis/Thesis_Rproject/figures/flower_cam_abundance_richness.png",plot= final_plot,
       width = 14, height = 6, dpi = 600)

IV.D.2.c. Interpretation of the model results

#remove all objects starting with flower_richness_
rm(list = ls(pattern = "^flower_richness_"))

IV.D.3. FLOWER CAMERA FULL Shannon index - Gaussian distribution

#transform flower_camera_modelling into wide format to use vegan package to get the shannon and simpson index
flower_cam_diversity <- flower_camera_modelling%>%
  #create new column with counts = 1
  mutate(count = 1) %>%
  #wide format filled with counts, fill empty cells with 0
  pivot_wider(names_from = Family, values_from = count, values_fill = 0)%>%
  #combine rows that have the same site, date, camera and flower species
  group_by(site, date , cam, flower_sp) %>%
  #sum the counts for each family
  summarise(across(Apidae:Chrysididae, sum), .groups = 'drop') %>%
  #calculate shannon index and simpson index
  mutate(shannon_diversity = diversity(across(Apidae:Chrysididae), index = "shannon"),
         simpson_diversity = diversity(across(Apidae:Chrysididae), index = "simpson"))%>%
  #keep only site, date, cam, flower_sp, shannon_diversity, simpson_diversity
 dplyr::select(site, date , cam, flower_sp, shannon_diversity, simpson_diversity)%>%
  #join the plant survey data
  left_join(planty, by = c("site"= "Site"))%>%
  #join environmental data
  left_join(envir_data, by = c("site"= "Site","date"= "Date", "average_flower_cover"))%>%
  #remove plant survey and landscape columns that are not needed for modelling
  dplyr::select(-c(Plot_Cover_T, Transect, majority_class, grass, snh, forest, urban, agri, water, Pastinaca.sativa, Daucus.carota, Floral_simpson_index_T, top2_ratio, minutes_since_9am)) %>%
  #scale the environmental data
  mutate(across(c(dm_wind_velocity, dm_temperature, average_flower_cover, Days_since_start, Floral_simpson_index_site), scale)) %>%
  #remove duplicates
  distinct()
## Warning in left_join(., envir_data, by = c(site = "Site", date = "Date", : Detected an unexpected many-to-many relationship between `x` and `y`.
## ℹ Row 1 of `x` matches multiple rows in `y`.
## ℹ Row 41 of `y` matches multiple rows in `x`.
## ℹ If a many-to-many relationship is expected, set `relationship =
##   "many-to-many"` to silence this warning.
#histogram of the shannon index
flower_cam_diversity %>%
  ggplot(aes(x = shannon_diversity)) +
  geom_histogram(binwidth=0.1, fill = "lightblue", color = "black") +
  labs(title = "Histogram of Shannon Diversity Index per Flower Camera",
       x = "Shannon Diversity Index",
       y = "Count")

#testing normality of the shannon index
shapiro.test(flower_cam_diversity$shannon_diversity) # p-value = 0.3392: no evidence against normality
## 
##  Shapiro-Wilk normality test
## 
## data:  flower_cam_diversity$shannon_diversity
## W = 0.96919, p-value = 0.3392
datawizard::describe_distribution(flower_cam_diversity$shannon_diversity)
## Mean |   SD |  IQR |        Range | Skewness | Kurtosis |  n | n_Missing
## ------------------------------------------------------------------------
## 1.67 | 0.65 | 0.98 | [0.56, 3.05] |     0.31 |    -0.70 | 40 |         0
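
For reference, the two indices summarised here have simple closed forms: Shannon is `-sum(p * log(p))` and the Simpson index returned by `vegan::diversity(index = "simpson")` is `1 - sum(p^2)`, where `p` are the relative abundances. The base-R sketch below implements both on made-up counts. The helper names `shannon` and `simpson` and the example vectors are illustrative only.

```r
# Illustrative helpers; `counts` holds one sample's detections per family
shannon <- function(counts) {
  p <- counts[counts > 0] / sum(counts)  # zero counts contribute nothing
  -sum(p * log(p))                       # natural log, as in vegan's default
}
simpson <- function(counts) {
  p <- counts / sum(counts)
  1 - sum(p^2)
}

even   <- c(10, 10, 10, 10)  # four equally common families
uneven <- c(37, 1, 1, 1)     # same richness, one dominant family

round(c(shannon_even = shannon(even), shannon_uneven = shannon(uneven)), 3)
round(c(simpson_even = simpson(even), simpson_uneven = simpson(uneven)), 3)
```

Both indices drop when one family dominates, even though the number of families is unchanged, which is why they complement the richness models fitted earlier.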

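The Shapiro-Wilk test gives no evidence against normality (W = 0.969, p = 0.339), and skewness (0.31) and kurtosis (-0.70) are modest, so the Shannon index is modelled with a Gaussian linear mixed model.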

# full model with insect SHANNON INDEX as response variable and environmental, weather and plant diversity variables as explanatory variables, and site as random effect
# linear mixed model (lmer)
flower_cam_diversity_mod1_gauss <- lmer(shannon_diversity 
                                         ~ average_flower_cover 
                                         + Floral_simpson_index_site 
                                         + Days_since_start  
                                         + dm_wind_velocity  
                                         + dm_temperature  
                                         + (1 | site),
                                       data = flower_cam_diversity)

summary(flower_cam_diversity_mod1_gauss)
## Linear mixed model fit by REML ['lmerMod']
## Formula: 
## shannon_diversity ~ average_flower_cover + Floral_simpson_index_site +  
##     Days_since_start + dm_wind_velocity + dm_temperature + (1 |      site)
##    Data: flower_cam_diversity
## 
## REML criterion at convergence: 79
## 
## Scaled residuals: 
##     Min      1Q  Median      3Q     Max 
## -2.2121 -0.6373 -0.1007  0.6416  2.1585 
## 
## Random effects:
##  Groups   Name        Variance Std.Dev.
##  site     (Intercept) 0.3128   0.5593  
##  Residual             0.2887   0.5373  
## Number of obs: 40, groups:  site, 9
## 
## Fixed effects:
##                           Estimate Std. Error t value
## (Intercept)                1.69063    0.20546   8.229
## average_flower_cover      -0.29354    0.38430  -0.764
## Floral_simpson_index_site -0.37912    0.38201  -0.992
## Days_since_start           0.15495    0.26112   0.593
## dm_wind_velocity          -0.27666    0.34614  -0.799
## dm_temperature            -0.06184    0.34123  -0.181
## 
## Correlation of Fixed Effects:
##             (Intr) avrg__ Flr___ Dys_s_ dm_wn_
## avrg_flwr_c -0.015                            
## Flrl_smps__ -0.021  0.726                     
## Dys_snc_str  0.027  0.259  0.143              
## dm_wnd_vlct -0.005 -0.124  0.264 -0.514       
## dm_tempertr -0.031 -0.299  0.194 -0.313  0.696
parameters(flower_cam_diversity_mod1_gauss)
## # Fixed Effects
## 
## Parameter                 | Coefficient |   SE |        95% CI | t(32) |      p
## -------------------------------------------------------------------------------
## (Intercept)               |        1.69 | 0.21 | [ 1.27, 2.11] |  8.23 | < .001
## average flower cover      |       -0.29 | 0.38 | [-1.08, 0.49] | -0.76 | 0.451 
## Floral simpson index site |       -0.38 | 0.38 | [-1.16, 0.40] | -0.99 | 0.328 
## Days since start          |        0.15 | 0.26 | [-0.38, 0.69] |  0.59 | 0.557 
## dm wind velocity          |       -0.28 | 0.35 | [-0.98, 0.43] | -0.80 | 0.430 
## dm temperature            |       -0.06 | 0.34 | [-0.76, 0.63] | -0.18 | 0.857 
## 
## # Random Effects
## 
## Parameter            | Coefficient |   SE |       95% CI
## --------------------------------------------------------
## SD (Intercept: site) |        0.56 | 0.28 | [0.21, 1.48]
## SD (Residual)        |        0.54 | 0.07 | [0.42, 0.69]
## 
## Uncertainty intervals (equal-tailed) and p-values (two-tailed) computed
##   using a Wald t-distribution approximation. Uncertainty intervals for
##   random effect variances computed using a Wald z-distribution
##   approximation.
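
The random-effect estimates can be summarised as an intraclass correlation, the share of the residual variation that lies between sites. The sketch below computes it from the variance components printed in `summary(flower_cam_diversity_mod1_gauss)`. This is the adjusted ICC, which ignores the variance explained by the fixed effects.

```r
var_site     <- 0.3128  # variance of the site intercepts
var_residual <- 0.2887  # residual variance

icc     <- var_site / (var_site + var_residual)
sd_site <- sqrt(var_site)  # matches the "SD (Intercept: site)" row above

round(c(icc = icc, sd_site = sd_site), 3)
```

With an ICC of about 0.52, roughly half of the variation not accounted for by the predictors is attributable to differences between sites, which supports keeping `site` as a random intercept.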
#check the model
check_model(flower_cam_diversity_mod1_gauss, verbose = T)
## Some of the variables were in matrix-format - probably you used
##   `scale()` on your data?
##   If so, and you get an error, please try `datawizard::standardize()` to
##   standardize your data.
## Some of the variables were in matrix-format - probably you used
##   `scale()` on your data?
##   If so, and you get an error, please try `datawizard::standardize()` to
##   standardize your data.

#overdispersion
check_overdispersion(flower_cam_diversity_mod1_gauss)
## # Overdispersion test
## 
##  dispersion ratio = 0.624
##           p-value = 0.144
## No overdispersion detected.
#collinearity
check_collinearity(flower_cam_diversity_mod1_gauss)
## # Check for Multicollinearity
## 
## Low Correlation
## 
##                       Term  VIF   VIF 95% CI Increased SE Tolerance
##       average_flower_cover 3.70 [2.51, 5.85]         1.92      0.27
##  Floral_simpson_index_site 3.75 [2.53, 5.92]         1.94      0.27
##           Days_since_start 1.54 [1.21, 2.40]         1.24      0.65
##           dm_wind_velocity 2.67 [1.88, 4.17]         1.63      0.37
##             dm_temperature 2.63 [1.85, 4.10]         1.62      0.38
##  Tolerance 95% CI
##      [0.17, 0.40]
##      [0.17, 0.39]
##      [0.42, 0.83]
##      [0.24, 0.53]
##      [0.24, 0.54]
# DHARMa package - simulate residuals and check model assumptions
flower_cam_diversity_mod1_gauss_sim_res <- simulateResiduals(fittedModel = flower_cam_diversity_mod1_gauss)
plot(flower_cam_diversity_mod1_gauss_sim_res)
## qu = 0.75, log(sigma) = -2.851525 : outer Newton did not converge fully.

#remove dm temperature (p = 0.857 for flower_cam_diversity_mod1_gauss)
flower_cam_diversity_mod2_gauss <- lmer(shannon_diversity 
                                         ~ average_flower_cover 
                                         + Floral_simpson_index_site 
                                         + Days_since_start  
                                         + dm_wind_velocity  
                                         #+ dm_temperature  
                                         + (1 | site),
                                       data = flower_cam_diversity)
summary(flower_cam_diversity_mod2_gauss)
## Linear mixed model fit by REML ['lmerMod']
## Formula: 
## shannon_diversity ~ average_flower_cover + Floral_simpson_index_site +  
##     Days_since_start + dm_wind_velocity + (1 | site)
##    Data: flower_cam_diversity
## 
## REML criterion at convergence: 78.6
## 
## Scaled residuals: 
##      Min       1Q   Median       3Q      Max 
## -2.21455 -0.62479 -0.09841  0.61968  2.15587 
## 
## Random effects:
##  Groups   Name        Variance Std.Dev.
##  site     (Intercept) 0.2210   0.4701  
##  Residual             0.2887   0.5373  
## Number of obs: 40, groups:  site, 9
## 
## Fixed effects:
##                           Estimate Std. Error t value
## (Intercept)                 1.6885     0.1787   9.450
## average_flower_cover       -0.3116     0.3193  -0.976
## Floral_simpson_index_site  -0.3633     0.3265  -1.113
## Days_since_start            0.1395     0.2155   0.647
## dm_wind_velocity           -0.2318     0.2157  -1.075
## 
## Correlation of Fixed Effects:
##             (Intr) avrg__ Flr___ Dys_s_
## avrg_flwr_c -0.023                     
## Flrl_smps__ -0.015  0.837              
## Dys_snc_str  0.017  0.183  0.220       
## dm_wnd_vlct  0.021  0.122  0.181 -0.435
parameters(flower_cam_diversity_mod2_gauss)
## # Fixed Effects
## 
## Parameter                 | Coefficient |   SE |        95% CI | t(33) |      p
## -------------------------------------------------------------------------------
## (Intercept)               |        1.69 | 0.18 | [ 1.32, 2.05] |  9.45 | < .001
## average flower cover      |       -0.31 | 0.32 | [-0.96, 0.34] | -0.98 | 0.336 
## Floral simpson index site |       -0.36 | 0.33 | [-1.03, 0.30] | -1.11 | 0.274 
## Days since start          |        0.14 | 0.22 | [-0.30, 0.58] |  0.65 | 0.522 
## dm wind velocity          |       -0.23 | 0.22 | [-0.67, 0.21] | -1.07 | 0.290 
## 
## # Random Effects
## 
## Parameter            | Coefficient |   SE |       95% CI
## --------------------------------------------------------
## SD (Intercept: site) |        0.47 | 0.22 | [0.19, 1.16]
## SD (Residual)        |        0.54 | 0.07 | [0.42, 0.69]
## 
## Uncertainty intervals (equal-tailed) and p-values (two-tailed) computed
##   using a Wald t-distribution approximation. Uncertainty intervals for
##   random effect variances computed using a Wald z-distribution
##   approximation.
#check the model
check_model(flower_cam_diversity_mod2_gauss, verbose = T)
## Some of the variables were in matrix-format - probably you used
##   `scale()` on your data?
##   If so, and you get an error, please try `datawizard::standardize()` to
##   standardize your data.
## Some of the variables were in matrix-format - probably you used
##   `scale()` on your data?
##   If so, and you get an error, please try `datawizard::standardize()` to
##   standardize your data.

#overdispersion
check_overdispersion(flower_cam_diversity_mod2_gauss)
## # Overdispersion test
## 
##  dispersion ratio = 0.731
##           p-value = 0.384
## No overdispersion detected.
#collinearity
check_collinearity(flower_cam_diversity_mod2_gauss)
## # Check for Multicollinearity
## 
## Low Correlation
## 
##                       Term  VIF   VIF 95% CI Increased SE Tolerance
##       average_flower_cover 3.36 [2.28, 5.37]         1.83      0.30
##  Floral_simpson_index_site 3.60 [2.42, 5.76]         1.90      0.28
##           Days_since_start 1.39 [1.12, 2.25]         1.18      0.72
##           dm_wind_velocity 1.37 [1.11, 2.23]         1.17      0.73
##  Tolerance 95% CI
##      [0.19, 0.44]
##      [0.17, 0.41]
##      [0.45, 0.89]
##      [0.45, 0.90]
# DHARMa package - simulate residuals and check model assumptions
flower_cam_diversity_mod2_gauss_sim_res <- simulateResiduals(fittedModel = flower_cam_diversity_mod2_gauss)
plot(flower_cam_diversity_mod2_gauss_sim_res)

#remove days since start (p = 0.522 for flower_cam_diversity_mod2_gauss)
flower_cam_diversity_mod3_gauss <- lmer(shannon_diversity 
                                         ~ average_flower_cover 
                                         + Floral_simpson_index_site 
                                         #+ Days_since_start  
                                         + dm_wind_velocity  
                                         #+ dm_temperature  
                                         + (1 | site),
                                       data = flower_cam_diversity)
summary(flower_cam_diversity_mod3_gauss)
## Linear mixed model fit by REML ['lmerMod']
## Formula: 
## shannon_diversity ~ average_flower_cover + Floral_simpson_index_site +  
##     dm_wind_velocity + (1 | site)
##    Data: flower_cam_diversity
## 
## REML criterion at convergence: 77.8
## 
## Scaled residuals: 
##     Min      1Q  Median      3Q     Max 
## -2.1183 -0.6618 -0.1283  0.6423  2.2503 
## 
## Random effects:
##  Groups   Name        Variance Std.Dev.
##  site     (Intercept) 0.1862   0.4315  
##  Residual             0.2889   0.5375  
## Number of obs: 40, groups:  site, 9
## 
## Fixed effects:
##                           Estimate Std. Error t value
## (Intercept)                 1.6860     0.1675  10.069
## average_flower_cover       -0.3479     0.2944  -1.181
## Floral_simpson_index_site  -0.4085     0.2988  -1.367
## dm_wind_velocity           -0.1705     0.1818  -0.938
## 
## Correlation of Fixed Effects:
##             (Intr) avrg__ Flr___
## avrg_flwr_c -0.026              
## Flrl_smps__ -0.018  0.831       
## dm_wnd_vlct  0.031  0.227  0.314
parameters(flower_cam_diversity_mod3_gauss)
## # Fixed Effects
## 
## Parameter                 | Coefficient |   SE |        95% CI | t(34) |      p
## -------------------------------------------------------------------------------
## (Intercept)               |        1.69 | 0.17 | [ 1.35, 2.03] | 10.07 | < .001
## average flower cover      |       -0.35 | 0.29 | [-0.95, 0.25] | -1.18 | 0.246 
## Floral simpson index site |       -0.41 | 0.30 | [-1.02, 0.20] | -1.37 | 0.181 
## dm wind velocity          |       -0.17 | 0.18 | [-0.54, 0.20] | -0.94 | 0.355 
## 
## # Random Effects
## 
## Parameter            | Coefficient |   SE |       95% CI
## --------------------------------------------------------
## SD (Intercept: site) |        0.43 | 0.19 | [0.19, 1.00]
## SD (Residual)        |        0.54 | 0.07 | [0.42, 0.69]
## 
## Uncertainty intervals (equal-tailed) and p-values (two-tailed) computed
##   using a Wald t-distribution approximation. Uncertainty intervals for
##   random effect variances computed using a Wald z-distribution
##   approximation.
#check the model
check_model(flower_cam_diversity_mod3_gauss, verbose = T)
## Some of the variables were in matrix-format - probably you used
##   `scale()` on your data?
##   If so, and you get an error, please try `datawizard::standardize()` to
##   standardize your data.
## Some of the variables were in matrix-format - probably you used
##   `scale()` on your data?
##   If so, and you get an error, please try `datawizard::standardize()` to
##   standardize your data.

#overdispersion
check_overdispersion(flower_cam_diversity_mod3_gauss)
## # Overdispersion test
## 
##  dispersion ratio = 0.808
##           p-value = 0.552
## No overdispersion detected.
#collinearity
check_collinearity(flower_cam_diversity_mod3_gauss)
## # Check for Multicollinearity
## 
## Low Correlation
## 
##                       Term  VIF   VIF 95% CI Increased SE Tolerance
##       average_flower_cover 3.25 [2.18, 5.26]         1.80      0.31
##  Floral_simpson_index_site 3.42 [2.28, 5.54]         1.85      0.29
##           dm_wind_velocity 1.11 [1.01, 2.91]         1.06      0.90
##  Tolerance 95% CI
##      [0.19, 0.46]
##      [0.18, 0.44]
##      [0.34, 0.99]
# DHARMa package - simulate residuals and check model assumptions
flower_cam_diversity_mod3_gauss_sim_res <- simulateResiduals(fittedModel = flower_cam_diversity_mod3_gauss)
plot(flower_cam_diversity_mod3_gauss_sim_res)

#remove dm wind velocity (p = 0.355 for flower_cam_diversity_mod3_gauss)
flower_cam_diversity_mod4_gauss <- lmer(shannon_diversity 
                                         ~ average_flower_cover 
                                         + Floral_simpson_index_site 
                                         #+ Days_since_start  
                                         #+ dm_wind_velocity  
                                         #+ dm_temperature  
                                         + (1 | site),
                                       data = flower_cam_diversity)
summary(flower_cam_diversity_mod4_gauss)
## Linear mixed model fit by REML ['lmerMod']
## Formula: 
## shannon_diversity ~ average_flower_cover + Floral_simpson_index_site +  
##     (1 | site)
##    Data: flower_cam_diversity
## 
## REML criterion at convergence: 77.1
## 
## Scaled residuals: 
##     Min      1Q  Median      3Q     Max 
## -2.2147 -0.6197 -0.1601  0.6083  2.1501 
## 
## Random effects:
##  Groups   Name        Variance Std.Dev.
##  site     (Intercept) 0.1788   0.4229  
##  Residual             0.2894   0.5380  
## Number of obs: 40, groups:  site, 9
## 
## Fixed effects:
##                           Estimate Std. Error t value
## (Intercept)                 1.6906     0.1649  10.251
## average_flower_cover       -0.2848     0.2826  -1.008
## Floral_simpson_index_site  -0.3202     0.2796  -1.145
## 
## Correlation of Fixed Effects:
##             (Intr) avrg__
## avrg_flwr_c -0.033       
## Flrl_smps__ -0.029  0.822
parameters(flower_cam_diversity_mod4_gauss)
## # Fixed Effects
## 
## Parameter                 | Coefficient |   SE |        95% CI | t(35) |      p
## -------------------------------------------------------------------------------
## (Intercept)               |        1.69 | 0.16 | [ 1.36, 2.03] | 10.25 | < .001
## average flower cover      |       -0.28 | 0.28 | [-0.86, 0.29] | -1.01 | 0.321 
## Floral simpson index site |       -0.32 | 0.28 | [-0.89, 0.25] | -1.15 | 0.260 
## 
## # Random Effects
## 
## Parameter            | Coefficient |   SE |       95% CI
## --------------------------------------------------------
## SD (Intercept: site) |        0.42 | 0.17 | [0.19, 0.92]
## SD (Residual)        |        0.54 | 0.07 | [0.42, 0.69]
## 
## Uncertainty intervals (equal-tailed) and p-values (two-tailed) computed
##   using a Wald t-distribution approximation. Uncertainty intervals for
##   random effect variances computed using a Wald z-distribution
##   approximation.
#check the model
check_model(flower_cam_diversity_mod4_gauss, verbose = T)
## Some of the variables were in matrix-format - probably you used
##   `scale()` on your data?
##   If so, and you get an error, please try `datawizard::standardize()` to
##   standardize your data.
## Some of the variables were in matrix-format - probably you used
##   `scale()` on your data?
##   If so, and you get an error, please try `datawizard::standardize()` to
##   standardize your data.

#overdispersion
check_overdispersion(flower_cam_diversity_mod4_gauss)
## # Overdispersion test
## 
##  dispersion ratio = 0.870
##           p-value =  0.76
## No overdispersion detected.
#collinearity
check_collinearity(flower_cam_diversity_mod4_gauss)
## # Check for Multicollinearity
## 
## Low Correlation
## 
##                       Term  VIF   VIF 95% CI Increased SE Tolerance
##       average_flower_cover 3.08 [2.07, 5.05]         1.75      0.32
##  Floral_simpson_index_site 3.08 [2.07, 5.05]         1.75      0.32
##  Tolerance 95% CI
##      [0.20, 0.48]
##      [0.20, 0.48]
# DHARMa package - simulate residuals and check model assumptions
flower_cam_diversity_mod4_gauss_sim_res <- simulateResiduals(fittedModel = flower_cam_diversity_mod4_gauss)
plot(flower_cam_diversity_mod4_gauss_sim_res)
## qu = 0.75, log(sigma) = -2.669422 : outer Newton did not converge fully.
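
The columns of the collinearity table for flower_cam_diversity_mod4_gauss are linked by simple identities, which the sketch below reproduces from the printed VIF. It also assumes that, with only two predictors, the magnitude of the predictor correlation can be read from the "Correlation of Fixed Effects" block (0.822). That holds exactly for ordinary least squares and only approximately for a mixed model, so treat the last value as a plausibility check rather than a derivation.

```r
# Value copied from the check_collinearity() output for flower_cam_diversity_mod4_gauss
vif_reported <- 3.08

tolerance    <- 1 / vif_reported     # share of a predictor's variance not explained by the others
increased_se <- sqrt(vif_reported)   # factor by which the standard error is inflated

# With exactly two predictors, VIF = 1 / (1 - r^2), where r is their correlation.
# Assumption: |r| is taken from the "Correlation of Fixed Effects" block (0.822).
r_assumed  <- 0.822
vif_from_r <- 1 / (1 - r_assumed^2)

round(c(tolerance = tolerance, increased_se = increased_se, vif_from_r = vif_from_r), 2)
```

The three values agree with the Tolerance, Increased SE and VIF columns printed above to two decimals.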

#remove average flower cover (p = 0.321 for flower_cam_diversity_mod4_gauss)
flower_cam_diversity_mod5_gauss <- lmer(shannon_diversity 
                                         #~ average_flower_cover 
                                         ~ Floral_simpson_index_site 
                                         #+ Days_since_start  
                                         #+ dm_wind_velocity  
                                         #+ dm_temperature  
                                         + (1 | site),
                                       data = flower_cam_diversity)
summary(flower_cam_diversity_mod5_gauss)
## Linear mixed model fit by REML ['lmerMod']
## Formula: shannon_diversity ~ Floral_simpson_index_site + (1 | site)
##    Data: flower_cam_diversity
## 
## REML criterion at convergence: 77.4
## 
## Scaled residuals: 
##     Min      1Q  Median      3Q     Max 
## -2.2117 -0.6531 -0.1092  0.6980  2.1468 
## 
## Random effects:
##  Groups   Name        Variance Std.Dev.
##  site     (Intercept) 0.1761   0.4196  
##  Residual             0.2903   0.5388  
## Number of obs: 40, groups:  site, 9
## 
## Fixed effects:
##                           Estimate Std. Error t value
## (Intercept)                1.68502    0.16397   10.28
## Floral_simpson_index_site -0.08873    0.15855   -0.56
## 
## Correlation of Fixed Effects:
##             (Intr)
## Flrl_smps__ -0.003
parameters(flower_cam_diversity_mod5_gauss)
## # Fixed Effects
## 
## Parameter                 | Coefficient |   SE |        95% CI | t(36) |      p
## -------------------------------------------------------------------------------
## (Intercept)               |        1.69 | 0.16 | [ 1.35, 2.02] | 10.28 | < .001
## Floral simpson index site |       -0.09 | 0.16 | [-0.41, 0.23] | -0.56 | 0.579 
## 
## # Random Effects
## 
## Parameter            | Coefficient |   SE |       95% CI
## --------------------------------------------------------
## SD (Intercept: site) |        0.42 | 0.15 | [0.20, 0.86]
## SD (Residual)        |        0.54 | 0.07 | [0.42, 0.69]
## 
## Uncertainty intervals (equal-tailed) and p-values (two-tailed) computed
##   using a Wald t-distribution approximation. Uncertainty intervals for
##   random effect variances computed using a Wald z-distribution
##   approximation.
#check the model
check_model(flower_cam_diversity_mod5_gauss, verbose = T)
## Not enough model terms in the conditional part of the model to check for
##   multicollinearity.
## Some of the variables were in matrix-format - probably you used
##   `scale()` on your data?
##   If so, and you get an error, please try `datawizard::standardize()` to
##   standardize your data.
## Some of the variables were in matrix-format - probably you used
##   `scale()` on your data?
##   If so, and you get an error, please try `datawizard::standardize()` to
##   standardize your data.

#overdispersion
check_overdispersion(flower_cam_diversity_mod5_gauss)
## # Overdispersion test
## 
##  dispersion ratio = 0.920
##           p-value = 0.904
## No overdispersion detected.
#collinearity
check_collinearity(flower_cam_diversity_mod5_gauss)
## Not enough model terms in the conditional part of the model to check for
##   multicollinearity.
## NULL
# DHARMa package - simulate residuals and check model assumptions
flower_cam_diversity_mod5_gauss_sim_res <- simulateResiduals(fittedModel = flower_cam_diversity_mod5_gauss)
plot(flower_cam_diversity_mod5_gauss_sim_res)
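
As a rough check on how much of the unexplained variation sits between sites, the adjusted intraclass correlation can be computed by hand from the variance components printed by summary(flower_cam_diversity_mod5_gauss). This is a minimal sketch using the rounded values from that output, not a replacement for performance::icc().

```r
# Variance components copied from the "Random effects" block of
# summary(flower_cam_diversity_mod5_gauss)
var_site     <- 0.1761   # random intercept variance (site)
var_residual <- 0.2903   # residual variance

# Adjusted ICC: share of the random + residual variance attributable to site
icc <- var_site / (var_site + var_residual)
round(icc, 3)
```

The result matches the ICC column reported for flower_cam_diversity_mod5_gauss in the model comparison table.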

IV.D.3.a. Compare the models with the performance package

# Compare the models with the performance package
flower_cam_diversity_gauss_comp1 <- compare_performance(flower_cam_diversity_mod1_gauss, flower_cam_diversity_mod2_gauss, flower_cam_diversity_mod3_gauss, flower_cam_diversity_mod4_gauss, flower_cam_diversity_mod5_gauss,
                                                 metrics = c("AICc", "BIC", "R2", "ICC", "RMSE"))
# Print the comparison table
print(flower_cam_diversity_gauss_comp1)
## # Comparison of Model Performance Indices
## 
## Name                            |   Model | AICc (weights) | BIC (weights)
## --------------------------------------------------------------------------
## flower_cam_diversity_mod1_gauss | lmerMod |   93.8 (0.003) | 102.6 (<.001)
## flower_cam_diversity_mod2_gauss | lmerMod |   89.3 (0.024) |  97.6 (0.008)
## flower_cam_diversity_mod3_gauss | lmerMod |   86.3 (0.106) |  93.9 (0.049)
## flower_cam_diversity_mod4_gauss | lmerMod |   84.4 (0.273) |  91.1 (0.200)
## flower_cam_diversity_mod5_gauss | lmerMod |   82.9 (0.595) |  88.5 (0.743)
## 
## Name                            | R2 (cond.) | R2 (marg.) |   ICC |  RMSE
## -------------------------------------------------------------------------
## flower_cam_diversity_mod1_gauss |      0.576 |      0.117 | 0.520 | 0.477
## flower_cam_diversity_mod2_gauss |      0.509 |      0.133 | 0.434 | 0.480
## flower_cam_diversity_mod3_gauss |      0.463 |      0.117 | 0.392 | 0.483
## flower_cam_diversity_mod4_gauss |      0.426 |      0.071 | 0.382 | 0.486
## flower_cam_diversity_mod5_gauss |      0.388 |      0.017 | 0.378 | 0.489
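
The weights in parentheses in the AICc column are Akaike weights. A minimal sketch of how they follow from the AICc values, using the rounded values printed in the table (so the results differ slightly from the table in the third decimal):

```r
# AICc values copied from the comparison table (rounded to one decimal)
aicc <- c(mod1 = 93.8, mod2 = 89.3, mod3 = 86.3, mod4 = 84.4, mod5 = 82.9)

delta   <- aicc - min(aicc)          # difference from the best (lowest AICc) model
rel_lik <- exp(-0.5 * delta)         # relative likelihood of each model
weights <- rel_lik / sum(rel_lik)    # normalise so the weights sum to 1

round(weights, 3)
```

The weights express relative support within this candidate set only. They say nothing about whether the best-ranked model fits well in absolute terms, which is why the R2 and RMSE columns are reported alongside.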

IV.D.3.b. Visualize the model results

#plot_model(flower_cam_diversity_mod1_gauss , type = "est", show.values = TRUE, value.offset = .3)
#plot_model(flower_cam_diversity_mod2_gauss , type = "est", show.values = TRUE, value.offset = .3)
#plot_model(flower_cam_diversity_mod3_gauss , type = "est", show.values = TRUE, value.offset = .3)
#plot_model(flower_cam_diversity_mod4_gauss , type = "est", show.values = TRUE, value.offset = .3)
plot_model(flower_cam_diversity_mod5_gauss , type = "est", show.values = TRUE, value.offset = .3)

plot_model(flower_cam_diversity_mod1_gauss, 
           type = "est", 
           show.values = TRUE, 
           value.offset = 0.3,
           #sort.est = TRUE,
           axis.labels = c(
             "Temperature",
             "Wind Velocity (km/h)",
             "Days since start",
             "Floral Simpson Index",
             "Average Flower Cover %" )) +
    labs(title = "Flower Camera: Shannon Diversity Index", x = "Predictors",y = "Estimate") + 
    theme(axis.text.y = element_text(hjust = 0))  # 0 = left, 1 = right

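None of the predictors has a significant effect on the Shannon diversity index recorded by the flower cameras. In the final model (flower_cam_diversity_mod5_gauss), the floral Simpson index has p = 0.579 and the marginal R2 is 0.017.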

IV.D.3.c. Interpretation of the model results

IV.D.4. FLOWER CAMERA FULL Simpson index - Beta regression

# histogram of the simpson index
flower_cam_diversity %>%
  ggplot(aes(x = simpson_diversity)) +
  geom_histogram( fill = "lightblue", color = "black") +
  labs(title = "Histogram of Simpson Diversity Index per Flower Camera",
       x = "Simpson Diversity Index",
       y = "Count")
## `stat_bin()` using `bins = 30`. Pick better value with `binwidth`.

#testing normality of the simpson index
shapiro.test(flower_cam_diversity$simpson_diversity) # p-value = 0.07352: normality is not rejected at alpha = 0.05
## 
##  Shapiro-Wilk normality test
## 
## data:  flower_cam_diversity$simpson_diversity
## W = 0.94961, p-value = 0.07352
datawizard::describe_distribution(flower_cam_diversity$simpson_diversity)
## Mean |   SD |  IQR |        Range | Skewness | Kurtosis |  n | n_Missing
## ------------------------------------------------------------------------
## 0.68 | 0.17 | 0.26 | [0.25, 0.92] |    -0.53 |    -0.56 | 40 |         0
summary(flower_cam_diversity$simpson_diversity)
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##  0.2548  0.5413  0.7127  0.6769  0.8025  0.9235

Like the other Simpson indices in this analysis, this Simpson index is bounded between 0 and 1. Here it has a mean of 0.68 and an SD of 0.17 (range 0.25 to 0.92), so we proceed with a beta regression model.
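
A beta likelihood is only defined for values strictly between 0 and 1. The summary above shows that this holds here (minimum 0.2548, maximum 0.9235), so no adjustment is needed. The sketch below shows how such a check could look, together with one commonly used compression (Smithson and Verkuilen 2006) for the case where exact 0s or 1s occur. The helper name squeeze_unit_interval and the example vector are hypothetical and not part of the original analysis.

```r
# Hypothetical helper: compress values from [0, 1] into the open interval (0, 1)
# using y' = (y * (n - 1) + 0.5) / n
squeeze_unit_interval <- function(y) {
  stopifnot(is.numeric(y), all(y >= 0 & y <= 1, na.rm = TRUE))
  n <- sum(!is.na(y))
  (y * (n - 1) + 0.5) / n
}

# Illustrative values only, not the study data
simpson_example <- c(0, 0.25, 0.54, 0.71, 0.80, 1)

# Only transform when boundary values are actually present
needs_squeeze    <- any(simpson_example <= 0 | simpson_example >= 1)
simpson_for_beta <- if (needs_squeeze) squeeze_unit_interval(simpson_example) else simpson_example

range(simpson_for_beta)
```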

# full model with insect SIMPSON INDEX as response variable and environmental, weather and plant diversity variables as explanatory variables, and site as random effect
# beta regression model
flower_cam_diversity_simpson_mod1_beta <- glmmTMB(simpson_diversity 
                                         ~ average_flower_cover 
                                         + Floral_simpson_index_site 
                                         + Days_since_start  
                                         + dm_wind_velocity  
                                         + dm_temperature  
                                         + (1 | site),
                                       data = flower_cam_diversity, 
                                       family = beta_family())
summary(flower_cam_diversity_simpson_mod1_beta)
##  Family: beta  ( logit )
## Formula:          
## simpson_diversity ~ average_flower_cover + Floral_simpson_index_site +  
##     Days_since_start + dm_wind_velocity + dm_temperature + (1 |      site)
## Data: flower_cam_diversity
## 
##      AIC      BIC   logLik deviance df.resid 
##    -28.0    -14.5     22.0    -44.0       32 
## 
## Random effects:
## 
## Conditional model:
##  Groups Name        Variance Std.Dev.
##  site   (Intercept) 0.03105  0.1762  
## Number of obs: 40, groups:  site, 9
## 
## Dispersion parameter for beta family (): 9.72 
## 
## Conditional model:
##                           Estimate Std. Error z value Pr(>|z|)    
## (Intercept)                 0.7599     0.1206   6.302 2.94e-10 ***
## average_flower_cover       -0.2692     0.2310  -1.165   0.2438    
## Floral_simpson_index_site  -0.4081     0.2258  -1.808   0.0707 .  
## Days_since_start            0.1155     0.1504   0.768   0.4425    
## dm_wind_velocity           -0.2763     0.1988  -1.390   0.1647    
## dm_temperature              0.0145     0.1928   0.075   0.9400    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
parameters(flower_cam_diversity_simpson_mod1_beta)
## # Fixed Effects
## 
## Parameter                 | Coefficient |   SE |        95% CI |     z |      p
## -------------------------------------------------------------------------------
## (Intercept)               |        0.76 | 0.12 | [ 0.52, 1.00] |  6.30 | < .001
## average flower cover      |       -0.27 | 0.23 | [-0.72, 0.18] | -1.17 | 0.244 
## Floral simpson index site |       -0.41 | 0.23 | [-0.85, 0.03] | -1.81 | 0.071 
## Days since start          |        0.12 | 0.15 | [-0.18, 0.41] |  0.77 | 0.443 
## dm wind velocity          |       -0.28 | 0.20 | [-0.67, 0.11] | -1.39 | 0.165 
## dm temperature            |        0.01 | 0.19 | [-0.36, 0.39] |  0.08 | 0.940 
## 
## # Dispersion
## 
## Parameter   | Coefficient |        95% CI
## -----------------------------------------
## (Intercept) |        9.72 | [5.92, 15.97]
## 
## # Random Effects Variances
## 
## Parameter            | Coefficient |       95% CI
## -------------------------------------------------
## SD (Intercept: site) |        0.18 | [0.02, 1.73]
## 
## Uncertainty intervals (equal-tailed) and p-values (two-tailed) computed
##   using a Wald z-distribution approximation.
#check the model
check_model(flower_cam_diversity_simpson_mod1_beta, verbose = T)
## `check_outliers()` does not yet support models of class `glmmTMB`.

#overdispersion
check_overdispersion(flower_cam_diversity_simpson_mod1_beta)
## # Overdispersion test
## 
##  dispersion ratio = 1.057
##           p-value = 0.744
## No overdispersion detected.
#collinearity
check_collinearity(flower_cam_diversity_simpson_mod1_beta)
## # Check for Multicollinearity
## 
## Low Correlation
## 
##                       Term  VIF   VIF 95% CI Increased SE Tolerance
##       average_flower_cover 3.76 [2.57, 5.85]         1.94      0.27
##  Floral_simpson_index_site 3.76 [2.57, 5.85]         1.94      0.27
##           Days_since_start 1.62 [1.26, 2.47]         1.27      0.62
##           dm_wind_velocity 2.85 [2.01, 4.40]         1.69      0.35
##             dm_temperature 2.70 [1.91, 4.15]         1.64      0.37
##  Tolerance 95% CI
##      [0.17, 0.39]
##      [0.17, 0.39]
##      [0.41, 0.79]
##      [0.23, 0.50]
##      [0.24, 0.52]
# DHARMa package - simulate residuals and check model assumptions
flower_cam_diversity_simpson_mod1_beta_sim_res <- simulateResiduals(fittedModel = flower_cam_diversity_simpson_mod1_beta)
plot(flower_cam_diversity_simpson_mod1_beta_sim_res)
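
For readers unfamiliar with simulation-based residuals, the core idea behind simulateResiduals() can be sketched in base R: simulate many new responses from the fitted model and record where each observed value falls within its own simulated distribution. The example uses an illustrative linear model on made-up data, not the study data, and leaves out refinements that DHARMa adds (for example the handling of ties in discrete responses).

```r
set.seed(1)

# Illustrative data and model (not the study data)
n   <- 40
x   <- rnorm(n)
y   <- 1 + 0.5 * x + rnorm(n, sd = 0.5)
fit <- lm(y ~ x)

# Simulate new responses from the fitted model: one row per observation,
# one column per simulation
n_sim <- 250
sims  <- as.matrix(simulate(fit, nsim = n_sim))

# Scaled residual = fraction of simulated values below the observed value
scaled_res <- rowMeans(sims < y)

# Under a correctly specified model these should look roughly uniform on [0, 1]
summary(scaled_res)
```

Because the residuals are defined on a common 0 to 1 scale whatever the error family, the same diagnostic plots can be read in the same way for the Gaussian and beta models in this section.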

#remove dm temperature (p = 0.940 for flower_cam_diversity_simpson_mod1_beta)
flower_cam_diversity_simpson_mod2_beta <- glmmTMB(simpson_diversity 
                                         ~ average_flower_cover 
                                         + Floral_simpson_index_site 
                                         + Days_since_start  
                                         + dm_wind_velocity  
                                         #+ dm_temperature  
                                         + (1 | site),
                                       data = flower_cam_diversity, 
                                       family = beta_family())
summary(flower_cam_diversity_simpson_mod2_beta)
##  Family: beta  ( logit )
## Formula:          
## simpson_diversity ~ average_flower_cover + Floral_simpson_index_site +  
##     Days_since_start + dm_wind_velocity + (1 | site)
## Data: flower_cam_diversity
## 
##      AIC      BIC   logLik deviance df.resid 
##    -30.0    -18.2     22.0    -44.0       33 
## 
## Random effects:
## 
## Conditional model:
##  Groups Name        Variance Std.Dev.
##  site   (Intercept) 0.0311   0.1764  
## Number of obs: 40, groups:  site, 9
## 
## Dispersion parameter for beta family (): 9.72 
## 
## Conditional model:
##                           Estimate Std. Error z value Pr(>|z|)    
## (Intercept)                 0.7599     0.1206   6.299 2.99e-10 ***
## average_flower_cover       -0.2639     0.2202  -1.199   0.2307    
## Floral_simpson_index_site  -0.4110     0.2225  -1.847   0.0647 .  
## Days_since_start            0.1194     0.1409   0.848   0.3966    
## dm_wind_velocity           -0.2868     0.1417  -2.024   0.0430 *  
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
parameters(flower_cam_diversity_simpson_mod2_beta)
## # Fixed Effects
## 
## Parameter                 | Coefficient |   SE |         95% CI |     z |      p
## --------------------------------------------------------------------------------
## (Intercept)               |        0.76 | 0.12 | [ 0.52,  1.00] |  6.30 | < .001
## average flower cover      |       -0.26 | 0.22 | [-0.70,  0.17] | -1.20 | 0.231 
## Floral simpson index site |       -0.41 | 0.22 | [-0.85,  0.03] | -1.85 | 0.065 
## Days since start          |        0.12 | 0.14 | [-0.16,  0.40] |  0.85 | 0.397 
## dm wind velocity          |       -0.29 | 0.14 | [-0.56, -0.01] | -2.02 | 0.043 
## 
## # Dispersion
## 
## Parameter   | Coefficient |        95% CI
## -----------------------------------------
## (Intercept) |        9.72 | [5.92, 15.97]
## 
## # Random Effects Variances
## 
## Parameter            | Coefficient |       95% CI
## -------------------------------------------------
## SD (Intercept: site) |        0.18 | [0.02, 1.73]
## 
## Uncertainty intervals (equal-tailed) and p-values (two-tailed) computed
##   using a Wald z-distribution approximation.
#check the model
check_model(flower_cam_diversity_simpson_mod2_beta, verbose = T)
## `check_outliers()` does not yet support models of class `glmmTMB`.

#check for singularity
performance::check_singularity(flower_cam_diversity_simpson_mod2_beta)
## [1] FALSE
#overdispersion
check_overdispersion(flower_cam_diversity_simpson_mod2_beta)
## # Overdispersion test
## 
##  dispersion ratio = 1.055
##           p-value = 0.776
## No overdispersion detected.
#collinearity
check_collinearity(flower_cam_diversity_simpson_mod2_beta)
## # Check for Multicollinearity
## 
## Low Correlation
## 
##                       Term  VIF   VIF 95% CI Increased SE Tolerance
##       average_flower_cover 3.41 [2.33, 5.38]         1.85      0.29
##  Floral_simpson_index_site 3.65 [2.48, 5.77]         1.91      0.27
##           Days_since_start 1.42 [1.14, 2.24]         1.19      0.70
##           dm_wind_velocity 1.45 [1.16, 2.27]         1.20      0.69
##  Tolerance 95% CI
##      [0.19, 0.43]
##      [0.17, 0.40]
##      [0.45, 0.88]
##      [0.44, 0.86]
# DHARMa package - simulate residuals and check model assumptions
flower_cam_diversity_simpson_mod2_beta_sim_res <- simulateResiduals(fittedModel = flower_cam_diversity_simpson_mod2_beta)
plot(flower_cam_diversity_simpson_mod2_beta_sim_res)

#remove days since start (p = 0.397 for flower_cam_diversity_simpson_mod2_beta)
flower_cam_diversity_simpson_mod3_beta <- glmmTMB(simpson_diversity 
                                         ~ average_flower_cover 
                                         + Floral_simpson_index_site 
                                         #+ Days_since_start  
                                         + dm_wind_velocity  
                                         #+ dm_temperature  
                                         + (1 | site),
                                       data = flower_cam_diversity, 
                                       family = beta_family())
summary(flower_cam_diversity_simpson_mod3_beta)
##  Family: beta  ( logit )
## Formula:          
## simpson_diversity ~ average_flower_cover + Floral_simpson_index_site +  
##     dm_wind_velocity + (1 | site)
## Data: flower_cam_diversity
## 
##      AIC      BIC   logLik deviance df.resid 
##    -31.3    -21.2     21.7    -43.3       34 
## 
## Random effects:
## 
## Conditional model:
##  Groups Name        Variance Std.Dev.
##  site   (Intercept) 0.04808  0.2193  
## Number of obs: 40, groups:  site, 9
## 
## Dispersion parameter for beta family (): 9.84 
## 
## Conditional model:
##                           Estimate Std. Error z value Pr(>|z|)    
## (Intercept)                 0.7604     0.1278   5.948 2.71e-09 ***
## average_flower_cover       -0.2886     0.2275  -1.268   0.2047    
## Floral_simpson_index_site  -0.4457     0.2288  -1.948   0.0514 .  
## dm_wind_velocity           -0.2318     0.1332  -1.740   0.0818 .  
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
parameters(flower_cam_diversity_simpson_mod3_beta)
## # Fixed Effects
## 
## Parameter                 | Coefficient |   SE |        95% CI |     z |      p
## -------------------------------------------------------------------------------
## (Intercept)               |        0.76 | 0.13 | [ 0.51, 1.01] |  5.95 | < .001
## average flower cover      |       -0.29 | 0.23 | [-0.73, 0.16] | -1.27 | 0.205 
## Floral simpson index site |       -0.45 | 0.23 | [-0.89, 0.00] | -1.95 | 0.051 
## dm wind velocity          |       -0.23 | 0.13 | [-0.49, 0.03] | -1.74 | 0.082 
## 
## # Dispersion
## 
## Parameter   | Coefficient |        95% CI
## -----------------------------------------
## (Intercept) |        9.84 | [6.06, 15.99]
## 
## # Random Effects Variances
## 
## Parameter            | Coefficient |       95% CI
## -------------------------------------------------
## SD (Intercept: site) |        0.22 | [0.05, 0.95]
## 
## Uncertainty intervals (equal-tailed) and p-values (two-tailed) computed
##   using a Wald z-distribution approximation.
#check the model
check_model(flower_cam_diversity_simpson_mod3_beta, verbose = T)
## `check_outliers()` does not yet support models of class `glmmTMB`.

#overdispersion
check_overdispersion(flower_cam_diversity_simpson_mod3_beta)
## # Overdispersion test
## 
##  dispersion ratio = 1.062
##           p-value = 0.744
## No overdispersion detected.
#collinearity
check_collinearity(flower_cam_diversity_simpson_mod3_beta)
## # Check for Multicollinearity
## 
## Low Correlation
## 
##                       Term  VIF   VIF 95% CI Increased SE Tolerance
##       average_flower_cover 3.23 [2.20, 5.15]         1.80      0.31
##  Floral_simpson_index_site 3.40 [2.30, 5.43]         1.84      0.29
##           dm_wind_velocity 1.13 [1.01, 2.56]         1.06      0.89
##  Tolerance 95% CI
##      [0.19, 0.46]
##      [0.18, 0.44]
##      [0.39, 0.99]
# DHARMa package - simulate residuals and check model assumptions
flower_cam_diversity_simpson_mod3_beta_sim_res <- simulateResiduals(fittedModel = flower_cam_diversity_simpson_mod3_beta)
plot(flower_cam_diversity_simpson_mod3_beta_sim_res)
## qu = 0.25, log(sigma) = -2.746367 : outer Newton did not converge fully.

#remove average flower cover (p = 0.205 for flower_cam_diversity_simpson_mod3_beta)
flower_cam_diversity_simpson_mod4_beta <- glmmTMB(simpson_diversity 
                                         #~ average_flower_cover 
                                         ~ Floral_simpson_index_site 
                                         #+ Days_since_start  
                                         + dm_wind_velocity  
                                         #+ dm_temperature  
                                         + (1 | site),
                                       data = flower_cam_diversity, 
                                       family = beta_family())
summary(flower_cam_diversity_simpson_mod4_beta)
##  Family: beta  ( logit )
## Formula:          
## simpson_diversity ~ Floral_simpson_index_site + dm_wind_velocity +  
##     (1 | site)
## Data: flower_cam_diversity
## 
##      AIC      BIC   logLik deviance df.resid 
##    -31.8    -23.4     20.9    -41.8       35 
## 
## Random effects:
## 
## Conditional model:
##  Groups Name        Variance Std.Dev.
##  site   (Intercept) 0.06915  0.263   
## Number of obs: 40, groups:  site, 9
## 
## Dispersion parameter for beta family (): 9.77 
## 
## Conditional model:
##                           Estimate Std. Error z value Pr(>|z|)    
## (Intercept)                 0.7584     0.1369   5.541    3e-08 ***
## Floral_simpson_index_site  -0.2056     0.1382  -1.487    0.137    
## dm_wind_velocity           -0.1911     0.1390  -1.374    0.169    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
parameters(flower_cam_diversity_simpson_mod4_beta)
## # Fixed Effects
## 
## Parameter                 | Coefficient |   SE |        95% CI |     z |      p
## -------------------------------------------------------------------------------
## (Intercept)               |        0.76 | 0.14 | [ 0.49, 1.03] |  5.54 | < .001
## Floral simpson index site |       -0.21 | 0.14 | [-0.48, 0.07] | -1.49 | 0.137 
## dm wind velocity          |       -0.19 | 0.14 | [-0.46, 0.08] | -1.37 | 0.169 
## 
## # Dispersion
## 
## Parameter   | Coefficient |        95% CI
## -----------------------------------------
## (Intercept) |        9.77 | [6.00, 15.93]
## 
## # Random Effects Variances
## 
## Parameter            | Coefficient |       95% CI
## -------------------------------------------------
## SD (Intercept: site) |        0.26 | [0.08, 0.87]
## 
## Uncertainty intervals (equal-tailed) and p-values (two-tailed) computed
##   using a Wald z-distribution approximation.
#check the model
check_model(flower_cam_diversity_simpson_mod4_beta, verbose = T)
## `check_outliers()` does not yet support models of class `glmmTMB`.

#overdispersion
check_overdispersion(flower_cam_diversity_simpson_mod4_beta)
## # Overdispersion test
## 
##  dispersion ratio = 1.086
##           p-value = 0.672
## No overdispersion detected.
#collinearity
check_collinearity(flower_cam_diversity_simpson_mod4_beta)
## # Check for Multicollinearity
## 
## Low Correlation
## 
##                       Term  VIF    VIF 95% CI Increased SE Tolerance
##  Floral_simpson_index_site 1.05 [1.00, 12.26]         1.03      0.95
##           dm_wind_velocity 1.05 [1.00, 12.26]         1.03      0.95
##  Tolerance 95% CI
##      [0.08, 1.00]
##      [0.08, 1.00]
# DHARMa package - simulate residuals and check model assumptions
flower_cam_diversity_simpson_mod4_beta_sim_res <- simulateResiduals(fittedModel = flower_cam_diversity_simpson_mod4_beta)
plot(flower_cam_diversity_simpson_mod4_beta_sim_res)

#remove dm wind velocity (p = 0.169 for flower_cam_diversity_simpson_mod4_beta)
flower_cam_diversity_simpson_mod5_beta <- glmmTMB(simpson_diversity 
                                         #~ average_flower_cover 
                                        ~ Floral_simpson_index_site 
                                         #+ Days_since_start  
                                         #+ dm_wind_velocity  
                                         #+ dm_temperature  
                                         + (1 | site),
                                       data = flower_cam_diversity, 
                                       family = beta_family(link = "logit"))
summary(flower_cam_diversity_simpson_mod5_beta)
##  Family: beta  ( logit )
## Formula:          simpson_diversity ~ Floral_simpson_index_site + (1 | site)
## Data: flower_cam_diversity
## 
##      AIC      BIC   logLik deviance df.resid 
##    -32.2    -25.4     20.1    -40.2       36 
## 
## Random effects:
## 
## Conditional model:
##  Groups Name        Variance Std.Dev.
##  site   (Intercept) 0.107    0.3271  
## Number of obs: 40, groups:  site, 9
## 
## Dispersion parameter for beta family (): 9.84 
## 
## Conditional model:
##                           Estimate Std. Error z value Pr(>|z|)    
## (Intercept)                 0.7643     0.1516   5.043 4.59e-07 ***
## Floral_simpson_index_site  -0.1624     0.1470  -1.105    0.269    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
parameters(flower_cam_diversity_simpson_mod5_beta)
## # Fixed Effects
## 
## Parameter                 | Coefficient |   SE |        95% CI |     z |      p
## -------------------------------------------------------------------------------
## (Intercept)               |        0.76 | 0.15 | [ 0.47, 1.06] |  5.04 | < .001
## Floral simpson index site |       -0.16 | 0.15 | [-0.45, 0.13] | -1.10 | 0.269 
## 
## # Dispersion
## 
## Parameter   | Coefficient |        95% CI
## -----------------------------------------
## (Intercept) |        9.84 | [6.04, 16.02]
## 
## # Random Effects Variances
## 
## Parameter            | Coefficient |       95% CI
## -------------------------------------------------
## SD (Intercept: site) |        0.33 | [0.13, 0.82]
## 
## Uncertainty intervals (equal-tailed) and p-values (two-tailed) computed
##   using a Wald z-distribution approximation.
#check the model
check_model(flower_cam_diversity_simpson_mod5_beta, verbose = T)
## Not enough model terms in the conditional part of the model to check for
##   multicollinearity.
## `check_outliers()` does not yet support models of class `glmmTMB`.

#overdispersion
check_overdispersion(flower_cam_diversity_simpson_mod5_beta)
## # Overdispersion test
## 
##  dispersion ratio = 1.114
##           p-value = 0.568
## No overdispersion detected.
#collinearity
check_collinearity(flower_cam_diversity_simpson_mod5_beta)
## Not enough model terms in the conditional part of the model to check for
##   multicollinearity.
## NULL
# DHARMa package - simulate residuals and check model assumptions
flower_cam_diversity_simpson_mod5_beta_sim_res <- simulateResiduals(fittedModel = flower_cam_diversity_simpson_mod5_beta)
plot(flower_cam_diversity_simpson_mod5_beta_sim_res)
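
The beta regression coefficients above are on the logit scale, which makes them hard to read directly. A minimal sketch of back-transforming the flower_cam_diversity_simpson_mod5_beta estimates to the Simpson index scale follows. It assumes that Floral_simpson_index_site was standardized (mean 0, SD 1) before fitting, which the output does not confirm, and it describes a typical site (random intercept of 0).

```r
# Fixed-effect estimates copied from summary(flower_cam_diversity_simpson_mod5_beta)
b0 <- 0.7643    # intercept (logit scale)
b1 <- -0.1624   # Floral_simpson_index_site (logit scale)

# plogis() is the inverse logit: exp(x) / (1 + exp(x))
pred_at_mean  <- plogis(b0)        # predictor at its mean (assumed to be 0)
pred_plus_1sd <- plogis(b0 + b1)   # predictor one unit (assumed 1 SD) above its mean

round(c(at_mean  = pred_at_mean,
        plus_1sd = pred_plus_1sd,
        change   = pred_plus_1sd - pred_at_mean), 3)
```

The back-transformed intercept is close to the observed mean Simpson index of 0.68, and the predicted change for a one-unit increase in the predictor is small, in line with its non-significant coefficient (p = 0.269).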

IV.D.4.a. Compare the models with the performance package

# Compare the models with the performance package
flower_cam_diversity_simpson_comp1 <- compare_performance(flower_cam_diversity_simpson_mod1_beta, flower_cam_diversity_simpson_mod2_beta, flower_cam_diversity_simpson_mod3_beta, flower_cam_diversity_simpson_mod4_beta, flower_cam_diversity_simpson_mod5_beta,
                                                 metrics = c("AICc", "BIC", "R2", "ICC", "RMSE"))
# Print the comparison table
print(flower_cam_diversity_simpson_comp1)
## # Comparison of Model Performance Indices
## 
## Name                                   |   Model | AICc (weights)
## -----------------------------------------------------------------
## flower_cam_diversity_simpson_mod1_beta | glmmTMB |  -23.4 (0.011)
## flower_cam_diversity_simpson_mod2_beta | glmmTMB |  -26.5 (0.051)
## flower_cam_diversity_simpson_mod3_beta | glmmTMB |  -28.8 (0.159)
## flower_cam_diversity_simpson_mod4_beta | glmmTMB |  -30.1 (0.299)
## flower_cam_diversity_simpson_mod5_beta | glmmTMB |  -31.0 (0.480)
## 
## Name                                   | BIC (weights) | R2 (cond.)
## -------------------------------------------------------------------
## flower_cam_diversity_simpson_mod1_beta | -14.5 (0.003) |      0.761
## flower_cam_diversity_simpson_mod2_beta | -18.2 (0.018) |      0.760
## flower_cam_diversity_simpson_mod3_beta | -21.2 (0.081) |      0.768
## flower_cam_diversity_simpson_mod4_beta | -23.4 (0.240) |      0.759
## flower_cam_diversity_simpson_mod5_beta | -25.4 (0.658) |      0.761
## 
## Name                                   | R2 (marg.) |   ICC |  RMSE
## -------------------------------------------------------------------
## flower_cam_diversity_simpson_mod1_beta |      0.586 | 0.422 | 0.142
## flower_cam_diversity_simpson_mod2_beta |      0.585 | 0.422 | 0.142
## flower_cam_diversity_simpson_mod3_beta |      0.502 | 0.533 | 0.139
## flower_cam_diversity_simpson_mod4_beta |      0.365 | 0.620 | 0.138
## flower_cam_diversity_simpson_mod5_beta |      0.153 | 0.718 | 0.137

IV.D.4.b. Visualize the model results

plot_model(flower_cam_diversity_simpson_mod1_beta , type = "est", show.values = TRUE, value.offset = .3)

plot_model(flower_cam_diversity_simpson_mod2_beta , type = "est", show.values = TRUE, value.offset = .3)

plot_model(flower_cam_diversity_simpson_mod3_beta , type = "est", show.values = TRUE, value.offset = .3)

plot_model(flower_cam_diversity_simpson_mod4_beta , type = "est", show.values = TRUE, value.offset = .3)

plot_model(flower_cam_diversity_simpson_mod5_beta , type = "est", show.values = TRUE, value.offset = .3)

plot_model(flower_cam_diversity_simpson_mod1_beta, 
           type = "est", 
           show.values = TRUE, 
           value.offset = 0.3,
           #sort.est = TRUE,
           axis.labels = c(
             "Temperature",
             "Wind Velocity (km/h)",
             "Days since start",
             "Floral Simpson Index",
             "Average Flower Cover %" )) +
    labs(title = "Flower Camera: Simpson Diversity Index", x = "Predictors",y = "Estimate") + 
    theme(axis.text.y = element_text(hjust = 0))  # 0 = left, 1 = right

IV.D.4.c. Interpretation of the model results

#remove all objects starting with flower_cam_diversity_
rm(list = ls(pattern = "^flower_cam_diversity_"))

Citations

# Get attached packages
attached_pkgs <- sessionInfo()$otherPkgs

# Extract package names
pkg_names <- names(attached_pkgs)
print(pkg_names)
##  [1] "forcats"      "stringr"      "dplyr"        "purrr"        "readr"       
##  [6] "tidyr"        "tibble"       "tidyverse"    "shiny"        "sjPlot"      
## [11] "DHARMa"       "arm"          "MASS"         "glmmTMB"      "ggeffects"   
## [16] "lme4"         "Matrix"       "see"          "report"       "parameters"  
## [21] "performance"  "modelbased"   "insight"      "effectsize"   "datawizard"  
## [26] "correlation"  "bayestestR"   "easystats"    "factoextra"   "ggplot2"     
## [31] "corrplot"     "Hmisc"        "lubridate"    "cowplot"      "patchwork"   
## [36] "RColorBrewer" "paletteer"    "vegan"        "lattice"      "permute"
# Get citations
pkg_citations <- lapply(pkg_names, function(pkg) {
  tryCatch(toBibtex(citation(pkg)), error = function(e) NULL)
})

# Combine and print
cat(unlist(pkg_citations), sep = "\n\n")
## @Manual{,
## 
##   title = {forcats: Tools for Working with Categorical Variables (Factors)},
## 
##   author = {Hadley Wickham},
## 
##   year = {2023},
## 
##   note = {R package version 1.0.0},
## 
##   url = {https://CRAN.R-project.org/package=forcats},
## 
## }
## 
## @Manual{,
## 
##   title = {stringr: Simple, Consistent Wrappers for Common String Operations},
## 
##   author = {Hadley Wickham},
## 
##   year = {2023},
## 
##   note = {R package version 1.5.1},
## 
##   url = {https://CRAN.R-project.org/package=stringr},
## 
## }
## 
## @Manual{,
## 
##   title = {dplyr: A Grammar of Data Manipulation},
## 
##   author = {Hadley Wickham and Romain François and Lionel Henry and Kirill Müller and Davis Vaughan},
## 
##   year = {2023},
## 
##   note = {R package version 1.1.4},
## 
##   url = {https://CRAN.R-project.org/package=dplyr},
## 
## }
## 
## @Manual{,
## 
##   title = {purrr: Functional Programming Tools},
## 
##   author = {Hadley Wickham and Lionel Henry},
## 
##   year = {2023},
## 
##   note = {R package version 1.0.2},
## 
##   url = {https://CRAN.R-project.org/package=purrr},
## 
## }
## 
## @Manual{,
## 
##   title = {readr: Read Rectangular Text Data},
## 
##   author = {Hadley Wickham and Jim Hester and Jennifer Bryan},
## 
##   year = {2024},
## 
##   note = {R package version 2.1.5},
## 
##   url = {https://CRAN.R-project.org/package=readr},
## 
## }
## 
## @Manual{,
## 
##   title = {tidyr: Tidy Messy Data},
## 
##   author = {Hadley Wickham and Davis Vaughan and Maximilian Girlich},
## 
##   year = {2024},
## 
##   note = {R package version 1.3.1},
## 
##   url = {https://CRAN.R-project.org/package=tidyr},
## 
## }
## 
## @Manual{,
## 
##   title = {tibble: Simple Data Frames},
## 
##   author = {Kirill Müller and Hadley Wickham},
## 
##   year = {2023},
## 
##   note = {R package version 3.2.1},
## 
##   url = {https://CRAN.R-project.org/package=tibble},
## 
## }
## 
## @Article{,
## 
##   title = {Welcome to the {tidyverse}},
## 
##   author = {Hadley Wickham and Mara Averick and Jennifer Bryan and Winston Chang and Lucy D'Agostino McGowan and Romain François and Garrett Grolemund and Alex Hayes and Lionel Henry and Jim Hester and Max Kuhn and Thomas Lin Pedersen and Evan Miller and Stephan Milton Bache and Kirill Müller and Jeroen Ooms and David Robinson and Dana Paige Seidel and Vitalie Spinu and Kohske Takahashi and Davis Vaughan and Claus Wilke and Kara Woo and Hiroaki Yutani},
## 
##   year = {2019},
## 
##   journal = {Journal of Open Source Software},
## 
##   volume = {4},
## 
##   number = {43},
## 
##   pages = {1686},
## 
##   doi = {10.21105/joss.01686},
## 
## }
## 
## @Manual{,
## 
##   title = {shiny: Web Application Framework for R},
## 
##   author = {Winston Chang and Joe Cheng and JJ Allaire and Carson Sievert and Barret Schloerke and Yihui Xie and Jeff Allen and Jonathan McPherson and Alan Dipert and Barbara Borges},
## 
##   year = {2024},
## 
##   note = {R package version 1.8.1.1},
## 
##   url = {https://CRAN.R-project.org/package=shiny},
## 
## }
## 
## @Manual{,
## 
##   title = {sjPlot: Data Visualization for Statistics in Social Science},
## 
##   author = {Daniel Lüdecke},
## 
##   year = {2024},
## 
##   note = {R package version 2.8.17},
## 
##   url = {https://CRAN.R-project.org/package=sjPlot},
## 
## }
## 
## @Manual{,
## 
##   title = {DHARMa: Residual Diagnostics for Hierarchical (Multi-Level / Mixed)
## Regression Models},
## 
##   author = {Florian Hartig},
## 
##   year = {2024},
## 
##   note = {R package version 0.4.7},
## 
##   url = {https://CRAN.R-project.org/package=DHARMa},
## 
## }
## 
## @Manual{,
## 
##   title = {arm: Data Analysis Using Regression and Multilevel/Hierarchical
## Models},
## 
##   author = {Andrew Gelman and Yu-Sung Su},
## 
##   year = {2024},
## 
##   note = {R package version 1.14-4},
## 
##   url = {https://CRAN.R-project.org/package=arm},
## 
## }
## 
## @Book{,
## 
##   title = {Modern Applied Statistics with S},
## 
##   author = {W. N. Venables and B. D. Ripley},
## 
##   publisher = {Springer},
## 
##   edition = {Fourth},
## 
##   address = {New York},
## 
##   year = {2002},
## 
##   note = {ISBN 0-387-95457-0},
## 
##   url = {https://www.stats.ox.ac.uk/pub/MASS4/},
## 
## }
## 
## @Article{,
## 
##   author = {Mollie E. Brooks and Kasper Kristensen and Koen J. {van Benthem} and Arni Magnusson and Casper W. Berg and Anders Nielsen and Hans J. Skaug and Martin Maechler and Benjamin M. Bolker},
## 
##   title = {{glmmTMB} Balances Speed and Flexibility Among Packages for Zero-inflated Generalized Linear Mixed Modeling},
## 
##   year = {2017},
## 
##   journal = {The R Journal},
## 
##   doi = {10.32614/RJ-2017-066},
## 
##   pages = {378--400},
## 
##   volume = {9},
## 
##   number = {2},
## 
## }
## 
## @Article{,
## 
##   title = {ggeffects: Tidy Data Frames of Marginal Effects from Regression Models.},
## 
##   volume = {3},
## 
##   doi = {10.21105/joss.00772},
## 
##   number = {26},
## 
##   journal = {Journal of Open Source Software},
## 
##   author = {Daniel Lüdecke},
## 
##   year = {2018},
## 
##   pages = {772},
## 
## }
## 
## @Article{,
## 
##   title = {Fitting Linear Mixed-Effects Models Using {lme4}},
## 
##   author = {Douglas Bates and Martin M{\"a}chler and Ben Bolker and Steve Walker},
## 
##   journal = {Journal of Statistical Software},
## 
##   year = {2015},
## 
##   volume = {67},
## 
##   number = {1},
## 
##   pages = {1--48},
## 
##   doi = {10.18637/jss.v067.i01},
## 
## }
## 
## @Manual{,
## 
##   title = {Matrix: Sparse and Dense Matrix Classes and Methods},
## 
##   author = {Douglas Bates and Martin Maechler and Mikael Jagan},
## 
##   year = {2024},
## 
##   note = {R package version 1.7-0},
## 
##   url = {https://CRAN.R-project.org/package=Matrix},
## 
## }
## 
## @Article{,
## 
##   title = {{see}: An {R} Package for Visualizing Statistical Models},
## 
##   author = {Daniel Lüdecke and Indrajeet Patil and Mattan S. Ben-Shachar and Brenton M. Wiernik and Philip Waggoner and Dominique Makowski},
## 
##   journal = {Journal of Open Source Software},
## 
##   year = {2021},
## 
##   volume = {6},
## 
##   number = {64},
## 
##   pages = {3393},
## 
##   doi = {10.21105/joss.03393},
## 
## }
## 
## @Article{,
## 
##   title = {Automated Results Reporting as a Practical Tool to Improve Reproducibility and Methodological Best Practices Adoption},
## 
##   author = {Dominique Makowski and Daniel Lüdecke and Indrajeet Patil and Rémi Thériault and Mattan S. Ben-Shachar and Brenton M. Wiernik},
## 
##   year = {2023},
## 
##   journal = {CRAN},
## 
##   url = {https://easystats.github.io/report/},
## 
## }
## 
## @Article{,
## 
##   title = {Extracting, Computing and Exploring the Parameters of Statistical Models using {R}.},
## 
##   volume = {5},
## 
##   doi = {10.21105/joss.02445},
## 
##   number = {53},
## 
##   journal = {Journal of Open Source Software},
## 
##   author = {Daniel Lüdecke and Mattan S. Ben-Shachar and Indrajeet Patil and Dominique Makowski},
## 
##   year = {2020},
## 
##   pages = {2445},
## 
## }
## 
## @Article{,
## 
##   title = {{performance}: An {R} Package for Assessment, Comparison and Testing of Statistical Models},
## 
##   author = {Daniel Lüdecke and Mattan S. Ben-Shachar and Indrajeet Patil and Philip Waggoner and Dominique Makowski},
## 
##   year = {2021},
## 
##   journal = {Journal of Open Source Software},
## 
##   volume = {6},
## 
##   number = {60},
## 
##   pages = {3139},
## 
##   doi = {10.21105/joss.03139},
## 
## }
## 
## @Article{,
## 
##   title = {Estimation of Model-Based Predictions, Contrasts and Means.},
## 
##   author = {Dominique Makowski and Mattan S. Ben-Shachar and Indrajeet Patil and Daniel Lüdecke},
## 
##   journal = {CRAN},
## 
##   year = {2020},
## 
##   url = {https://github.com/easystats/modelbased},
## 
## }
## 
## @Article{,
## 
##   title = {{insight}: A Unified Interface to Access Information from Model Objects in {R}.},
## 
##   volume = {4},
## 
##   doi = {10.21105/joss.01412},
## 
##   number = {38},
## 
##   journal = {Journal of Open Source Software},
## 
##   author = {Daniel Lüdecke and Philip Waggoner and Dominique Makowski},
## 
##   year = {2019},
## 
##   pages = {1412},
## 
## }
## 
## @Article{,
## 
##   title = {{e}ffectsize: Estimation of Effect Size Indices and Standardized Parameters},
## 
##   author = {Mattan S. Ben-Shachar and Daniel Lüdecke and Dominique Makowski},
## 
##   year = {2020},
## 
##   journal = {Journal of Open Source Software},
## 
##   volume = {5},
## 
##   number = {56},
## 
##   pages = {2815},
## 
##   publisher = {The Open Journal},
## 
##   doi = {10.21105/joss.02815},
## 
##   url = {https://doi.org/10.21105/joss.02815},
## 
## }
## 
## @Article{,
## 
##   title = {{datawizard}: An {R} Package for Easy Data Preparation and Statistical Transformations},
## 
##   author = {Indrajeet Patil and Dominique Makowski and Mattan S. Ben-Shachar and Brenton M. Wiernik and Etienne Bacher and Daniel Lüdecke},
## 
##   journal = {Journal of Open Source Software},
## 
##   year = {2022},
## 
##   volume = {7},
## 
##   number = {78},
## 
##   pages = {4684},
## 
##   doi = {10.21105/joss.04684},
## 
## }
## 
## @Misc{correlationPackage,
## 
##   title = {{{correlation}}: Methods for Correlation Analysis},
## 
##   shorttitle = {{{correlation}}},
## 
##   author = {Dominique Makowski and Brenton M. Wiernik and Indrajeet Patil and Daniel Lüdecke and Mattan S. Ben-Shachar},
## 
##   year = {2022},
## 
##   month = {oct},
## 
##   note = {Version 0.8.3},
## 
##   url = {https://CRAN.R-project.org/package=correlation},
## 
## }
## 
## 
## 
## @Article{correlationArticle,
## 
##   title = {Methods and Algorithms for Correlation Analysis in {{R}}},
## 
##   author = {Dominique Makowski and Mattan S. Ben-Shachar and Indrajeet Patil and Daniel Lüdecke},
## 
##   doi = {10.21105/joss.02306},
## 
##   year = {2020},
## 
##   journal = {Journal of Open Source Software},
## 
##   number = {51},
## 
##   volume = {5},
## 
##   pages = {2306},
## 
##   url = {https://joss.theoj.org/papers/10.21105/joss.02306},
## 
## }
## 
## @Article{,
## 
##   title = {bayestestR: Describing Effects and their Uncertainty, Existence and Significance within the Bayesian Framework.},
## 
##   author = {Dominique Makowski and Mattan S. Ben-Shachar and Daniel Lüdecke},
## 
##   journal = {Journal of Open Source Software},
## 
##   doi = {10.21105/joss.01541},
## 
##   year = {2019},
## 
##   number = {40},
## 
##   volume = {4},
## 
##   pages = {1541},
## 
##   url = {https://joss.theoj.org/papers/10.21105/joss.01541},
## 
## }
## 
## @Article{,
## 
##   title = {easystats: Framework for Easy Statistical Modeling, Visualization, and Reporting},
## 
##   author = {Daniel Lüdecke and Mattan S. Ben-Shachar and Indrajeet Patil and Brenton M. Wiernik and Etienne Bacher and Rémi Thériault and Dominique Makowski},
## 
##   journal = {CRAN},
## 
##   doi = {10.32614/CRAN.package.easystats},
## 
##   year = {2022},
## 
##   note = {R package},
## 
##   url = {https://easystats.github.io/easystats/},
## 
## }
## 
## @Manual{,
## 
##   title = {factoextra: Extract and Visualize the Results of Multivariate Data Analyses},
## 
##   author = {Alboukadel Kassambara and Fabian Mundt},
## 
##   year = {2020},
## 
##   note = {R package version 1.0.7},
## 
##   url = {https://CRAN.R-project.org/package=factoextra},
## 
## }
## 
## @Book{,
## 
##   author = {Hadley Wickham},
## 
##   title = {ggplot2: Elegant Graphics for Data Analysis},
## 
##   publisher = {Springer-Verlag New York},
## 
##   year = {2016},
## 
##   isbn = {978-3-319-24277-4},
## 
##   url = {https://ggplot2.tidyverse.org},
## 
## }
## 
## @Manual{corrplot2024,
## 
##   title = {R package 'corrplot': Visualization of a Correlation Matrix},
## 
##   author = {Taiyun Wei and Viliam Simko},
## 
##   year = {2024},
## 
##   note = {(Version 0.95)},
## 
##   url = {https://github.com/taiyun/corrplot},
## 
## }
## 
## @Manual{,
## 
##   title = {Hmisc: Harrell Miscellaneous},
## 
##   author = {Frank E {Harrell Jr}},
## 
##   year = {2025},
## 
##   note = {R package version 5.2-3},
## 
##   url = {https://CRAN.R-project.org/package=Hmisc},
## 
## }
## 
## @Article{,
## 
##   title = {Dates and Times Made Easy with {lubridate}},
## 
##   author = {Garrett Grolemund and Hadley Wickham},
## 
##   journal = {Journal of Statistical Software},
## 
##   year = {2011},
## 
##   volume = {40},
## 
##   number = {3},
## 
##   pages = {1--25},
## 
##   url = {https://www.jstatsoft.org/v40/i03/},
## 
## }
## 
## @Manual{,
## 
##   title = {cowplot: Streamlined Plot Theme and Plot Annotations for 'ggplot2'},
## 
##   author = {Claus O. Wilke},
## 
##   year = {2024},
## 
##   note = {R package version 1.1.3},
## 
##   url = {https://CRAN.R-project.org/package=cowplot},
## 
## }
## 
## @Manual{,
## 
##   title = {patchwork: The Composer of Plots},
## 
##   author = {Thomas Lin Pedersen},
## 
##   year = {2024},
## 
##   note = {R package version 1.3.0},
## 
##   url = {https://CRAN.R-project.org/package=patchwork},
## 
## }
## 
## @Manual{,
## 
##   title = {RColorBrewer: ColorBrewer Palettes},
## 
##   author = {Erich Neuwirth},
## 
##   year = {2022},
## 
##   note = {R package version 1.1-3},
## 
##   url = {https://CRAN.R-project.org/package=RColorBrewer},
## 
## }
## 
## @Manual{,
## 
##   title = {paletteer: Comprehensive Collection of Color Palettes},
## 
##   author = {Emil Hvitfeldt},
## 
##   year = {2021},
## 
##   note = {R package version 1.3.0},
## 
##   url = {https://github.com/EmilHvitfeldt/paletteer},
## 
## }
## 
## @Manual{,
## 
##   title = {vegan: Community Ecology Package},
## 
##   author = {Jari Oksanen and Gavin L. Simpson and F. Guillaume Blanchet and Roeland Kindt and Pierre Legendre and Peter R. Minchin and R.B. O'Hara and Peter Solymos and M. Henry H. Stevens and Eduard Szoecs and Helene Wagner and Matt Barbour and Michael Bedward and Ben Bolker and Daniel Borcard and Gustavo Carvalho and Michael Chirico and Miquel {De Caceres} and Sebastien Durand and Heloisa Beatriz Antoniazi Evangelista and Rich FitzJohn and Michael Friendly and Brendan Furneaux and Geoffrey Hannigan and Mark O. Hill and Leo Lahti and Dan McGlinn and Marie-Helene Ouellette and Eduardo {Ribeiro Cunha} and Tyler Smith and Adrian Stier and Cajo J.F. {Ter Braak} and James Weedon},
## 
##   year = {2024},
## 
##   note = {R package version 2.6-6.1},
## 
##   url = {https://CRAN.R-project.org/package=vegan},
## 
## }
## 
## @Book{,
## 
##   title = {Lattice: Multivariate Data Visualization with R},
## 
##   author = {Deepayan Sarkar},
## 
##   year = {2008},
## 
##   publisher = {Springer},
## 
##   address = {New York},
## 
##   isbn = {978-0-387-75968-5},
## 
##   url = {http://lmdvr.r-forge.r-project.org},
## 
## }
## 
## @Manual{,
## 
##   title = {permute: Functions for Generating Restricted Permutations of Data},
## 
##   author = {Gavin L. Simpson},
## 
##   year = {2022},
## 
##   note = {R package version 0.9-7},
## 
##   url = {https://CRAN.R-project.org/package=permute},
## 
## }
# Write the citations to a .bib file in the project's data folder
writeLines(unlist(pkg_citations), "C:/Users/Almas/Desktop/UNI_LEIPSI/Thesis/Thesis_Rproject/data/r_packages.bib")
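
Most of the entries printed above carry an empty key (for example `@Manual{,`), which can be a problem when the .bib file is used with LaTeX or a reference manager. The following is a minimal sketch of one way to fill the keys in before writing the file. `add_bib_keys()` is a hypothetical helper, not part of the original script or of any package, and the key scheme (the package name, with a counter when a package returns several keyless entries) is an assumption.

```r
# Hypothetical helper (not in the original script): give every BibTeX entry
# that has an empty key a key built from the package name.
add_bib_keys <- function(bib_lines, pkg) {
  bib_lines <- as.character(bib_lines)
  # Entry headers with an empty key look like "@Manual{," or "@Article{,"
  empty_key <- grepl("^@[A-Za-z]+\\{,\\s*$", bib_lines)
  n <- sum(empty_key)
  if (n == 0) return(bib_lines)  # all entries already have keys
  # One keyless entry: use the package name; several: append a counter
  keys <- if (n == 1) pkg else paste0(pkg, "_", seq_len(n))
  bib_lines[empty_key] <- paste0(sub(",\\s*$", "", bib_lines[empty_key]), keys, ",")
  bib_lines
}

# Demonstration on a hand-written entry (shortened from the forcats output)
example_bib <- c(
  "@Manual{,",
  "  title = {forcats: Tools for Working with Categorical Variables (Factors)},",
  "  year = {2023},",
  "}"
)
keyed <- add_bib_keys(example_bib, "forcats")
cat(keyed, sep = "\n")
```

In the script above, the helper could be applied inside the `lapply()` call, for example `add_bib_keys(toBibtex(citation(pkg)), pkg)`, so that the keyed lines are what `writeLines()` saves. Entries that already have a key, such as `correlationPackage`, are left unchanged.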